(Please keep the full text of the question as part of your answer sheet. You can use this file as a template. Type/Insert in your answers after each question. )
Q1: What are the transport protocols used in the following applications? (Be sure to describe/explain the protocol) (10 points)
Simple connection-oriented streaming voice/video without control for pause, stop, resume, forward, backward.
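RTP. The Real-time Transport Protocol delivers audio and video over IP networks and is designed for end-to-end, real-time transmission of media data. Its companion, the RTP Control Protocol (RTCP), only reports delivery statistics; it offers no playback controls.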
Unreliable, no handshaking, no ordering, no retransmission of data
UDP. The User Datagram Protocol is connectionless: it sends datagrams without a handshake and does not order or retransmit them.
SS7 transport
SIGTRAN. The standard transport protocol for Signaling System 7 (SS7) over the Internet is SIGTRAN. A gateway transforms SS7 signals into SIGTRAN packets for IP transmission to either a signaling gateway or a softswitch. SIGTRAN is a protocol stack that contains IP, SCTP, and adaptation protocols that support the primitives used to build the more sophisticated interfaces required by other protocols.
Connection-oriented, reliable and ordered 3-way handshaking streaming of data transmission
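TCP. The Transmission Control Protocol provides reliable, ordered, error-checked delivery of a stream of octets between applications that require reliable data transmission. A connection is established with a three-way handshake (SYN, SYN-ACK, ACK) before any data is sent.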
Streaming voice/video with control for pause, stop, resume, forward, backward.
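DCCP. The Datagram Congestion Control Protocol is especially useful for controlling network congestion where streaming media is concerned. The playback controls themselves (pause, stop, resume, forward, backward) are normally provided by RTSP, the Real Time Streaming Protocol, which runs alongside the media transport.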
Connection-oriented, reliable and ordered 3-way handshaking blocks (chunks) of stream data transmission
SCTP. The Stream Control Transmission Protocol separates messages into chunks, and each chunk has its own header. Several chunks can be bundled into one packet, and multiple independent streams can be carried in parallel within a single association.
Reservation of transmission bandwidth without real-time traffic feedback
RSVP. The Resource Reservation Protocol reserves network resources under the Integrated Services model. Hosts and routers use it to request and deliver specific levels of QoS.
Reservation of transmission bandwidth with real-time traffic feedback
RSVP-TE. This is an extension of the Resource Reservation Protocol which provides extended Traffic Engineering capability.
Providing secure but not reliable transmission
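UDP. The User Datagram Protocol has no handshake and no retransmission, so delivery is unreliable. UDP itself provides no security; that is added by running Datagram Transport Layer Security (DTLS) on top of it.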
Providing secure and reliable transmission of data
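TCP. The Transmission Control Protocol provides reliable, error-checked, and ordered data delivery. TCP itself provides no security; that is added by running Transport Layer Security (TLS) on top of it.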
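The difference between the connectionless and the connection-oriented transports above can be seen with two loopback sockets. This is only a minimal sketch using Python's standard socket module; the function names udp_roundtrip and tcp_roundtrip are made up for illustration, and the loopback address with an OS-chosen port is an assumption made so that the example is self-contained.

```python
import socket

def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram over loopback UDP: no handshake, no retransmission."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # UDP would never tell us about a loss
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()

def tcp_roundtrip(payload: bytes) -> bytes:
    """Send a byte stream over loopback TCP: handshake first, then ordered data."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    conn = None
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        listener.settimeout(2.0)
        client.settimeout(2.0)
        client.connect(listener.getsockname())  # three-way handshake happens here
        conn, _addr = listener.accept()
        conn.settimeout(2.0)
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)         # signal the end of the stream
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:                       # empty read: peer closed its side
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        if conn is not None:
            conn.close()
        client.close()
        listener.close()
```

The UDP sender never learns whether the datagram arrived, while the TCP client cannot send anything until connect() has completed the handshake.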
Q2: What mechanism is used to detect/avoid/correct data transmission collisions in Layer 2, such as Ethernet and WiFi? Describe the mechanism in sufficient detail. (10 points)
The IEEE 802.11 specification defines two access protocols: basic CSMA/CA and Request to Send/Clear to Send (RTS/CTS). With basic CSMA/CA, a terminal senses the medium to see whether other terminals are transmitting. If the medium stays idle for longer than the DCF Inter Frame Space (DIFS), the terminal transmits. If the medium is sensed busy, the terminal defers its transmission until the end of the current transmission. Before it tries again, the terminal starts a random backoff interval. The backoff timer is frozen while the medium is busy; after a busy period it resumes only once the medium has been idle for longer than the DIFS. The terminal transmits when the timer reaches zero. If the frame is successfully received at the destination, the receiver sends an acknowledgment (ACK) back to the sender after a Short Inter Frame Space (SIFS).
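The DIFS wait and the frozen backoff timer can be traced with a small slot-based sketch. It is a simplification and not the 802.11 state machine: time is divided into equal slots, DIFS is expressed as a whole number of slots, and the backoff value is passed in rather than drawn at random. The function name transmit_slot and its parameters are hypothetical.

```python
def transmit_slot(medium_busy, difs_slots, backoff):
    """Return the slot index in which the station transmits, or None.

    medium_busy: list of booleans, one per slot (True means the medium is busy).
    difs_slots:  number of idle slots the station must observe first (assumed unit).
    backoff:     initial value of the backoff counter, in slots.
    """
    idle_run = 0
    for slot, busy in enumerate(medium_busy):
        if busy:
            idle_run = 0          # counter is frozen; a fresh DIFS is needed
            continue
        idle_run += 1
        if idle_run <= difs_slots:
            continue              # still waiting out the DIFS
        if backoff == 0:
            return slot           # timer reached zero: transmit now
        backoff -= 1              # idle slot after DIFS: count down
    return None                   # medium never stayed idle long enough


F, T = False, True
print(transmit_slot([F] * 10, difs_slots=2, backoff=3))                        # 5
print(transmit_slot([F, F, F, T, T, F, F, F, F, F], difs_slots=2, backoff=3))  # 9
```

In the second call the busy period in slots 3 and 4 freezes the counter at 2, and the countdown resumes only after another full DIFS, which delays the transmission from slot 5 to slot 9.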
Q3: Describe how a cell phone obtains its IP address in the 3GPP/IMS packet switching system. (10 points)
In the 3GPP/IMS system, a cell phone obtains its IP address when it attaches to the packet-switched network; in LTE the address is assigned during the attach procedure. In the TISPAN NGN architecture, which builds on IMS, the Network Attachment SubSystem (NASS) is responsible for authentication and authorization based on the user identity, and for IP address allocation and configuration of the user’s device via the e1 reference point. If the user device is a Customer Network Gateway (CNG), the e3 reference point is used to configure it. Finally, the NASS supports roaming with the help of two reference points:
– The e5 interface is used to proxy user authentication requests to the home network.
– The e2 reference point enables Application Functions (AFs) to retrieve information about the characteristics of the IP-connectivity session used to access such applications (e.g., network location information) from the Connectivity session Location and repository Function (CLF), a sub-component of the NASS.
The Resource and Admission Control Subsystem (RACS) is itself composed of two primary function blocks.
The Access RACF (A-RACF) is a functional entity that manages resources in the AN and performs admission control, taking into account the user’s access profile retrieved from the NASS through the e4 reference point. A Core RACF (C-RACF) is also defined to manage the aggregation network’s resources (but it is user-unaware). The Service Policy Decision Function (SPDF) is a functional entity acting as a policy decision point for service requests received from an AF. It applies operator-defined policy rules that specify a service’s resource needs, NAT and firewall traversal rules, etc. It doesn’t consider the user identity as it has no direct access to the NASS. The SPDF performs a coordination function between the AF, A-RACF and BGF. It also supports charging. Roaming is supported over the Ri’ interface that links the SPDF in the home network to the one in the visited network.
Q4: What is Distance-Vector Routing? What is Link-State Routing? What is Dijkstra’s Algorithm? What is Bellman-Ford Algorithm? Describe each in detail. (20 points)
In Link-State Routing, every router floods a description of its own links and their costs to all other routers, so each router holds a complete map of the topology and computes shortest paths over it locally, typically with Dijkstra's algorithm. Routes are therefore always available regardless of need, at the cost of power consumption and signaling traffic. Reactive (on-demand) routing protocols take the opposite approach: the standing routing table is eliminated, which reduces the need to update routing tables as the topology changes. When a source needs to connect to a particular destination, it creates and maintains the routing path using route discovery and route maintenance procedures, and a route deletion process tears the route down. Reactive protocols are efficient in power consumption and signaling, but they suffer more delay while discovering a route. Both types of routing protocol have been constantly evolving to become more secure and scalable and to provide high QoS.
In Distance-Vector Routing, each node keeps a table of the distance (cost) and next hop to every known destination and exchanges that information only with its neighbors. In the on-demand variant used in ad hoc networks (AODV), a node that requires a connection broadcasts a route request to the network. The other ad hoc nodes receive this request, forward it to other nodes, and record the node that sent it. The nodes then send the route back to the requesting node over these temporary routes. After receiving the candidate routes, the requesting node uses the one with the minimum number of hops. Unused entries in the routing table are deleted after some time. If a link fails, a routing error is passed back to the transmitting node and the procedure is repeated. The difficulty with this routing protocol lies in preserving the capacity of the network, which is done by reducing the number of messages. Nodes use sequence numbers so that the same request is not passed on repeatedly. Route requests also have a lifetime, that is, a limit on the number of times the request can be retransmitted. In addition, once a route request fails, the node cannot send another request until twice the waiting time of the first request has passed.
Dijkstra's Algorithm computes a shortest-path tree. It builds a tree-shaped data structure, rooted at the source node, that represents network paths. The search first fans out to the neighboring nodes and then penetrates deeper into the network, always expanding the unvisited node with the lowest known cost. As the tree grows across nodes in multiple directions, it retains only the path with the least cost to each node. The algorithm assumes that link costs are non-negative.
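A minimal sketch of Dijkstra's algorithm follows, assuming the network is given as a dictionary that maps each node to its neighbors and non-negative link costs. The function name dijkstra and the four-node example topology are illustrative and do not come from any particular routing implementation.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, parent) describing the shortest-path tree rooted at source.

    graph: dict mapping node -> dict of neighbor -> non-negative link cost.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]          # candidates ordered by tentative cost
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue              # stale entry: u was already finalized
        done.add(u)
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd      # found a cheaper path to v through u
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


# Hypothetical four-node topology.
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
dist, parent = dijkstra(graph, "A")
print(dist)    # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
print(parent)  # {'A': None, 'B': 'A', 'C': 'B', 'D': 'C'}
```

The parent map is the shortest-path tree: following it from D back to A gives the path A, B, C, D with total cost 4, which is cheaper than the direct B to D link.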
In the Bellman-Ford algorithm, each node repeatedly updates its distance to every destination as the minimum, over its neighbors, of the link cost to that neighbor plus the distance the neighbor reports. The distributed Bellman-Ford shortest-path algorithm was used in the early ARPANET and is still used in gateway-to-gateway routing (GGP in TCP/IP). However, it has a number of major drawbacks. One is that its response to link or node failures, or to increases in link weights, can be very slow (an effect known as "bouncing"): nodes may engage in a prolonged and repeated exchange of their distances before converging to the shortest paths. Moreover, if the network is disconnected, the distributed BF algorithm is not guaranteed to terminate; this is the so-called "counting-to-infinity" problem, in which each node keeps increasing its distances to unreachable destinations indefinitely. Another shortcoming of the distributed BF algorithm is that the paths implied by the routing tables of all nodes taken together can contain routing-table loops. A routing-table loop exists when a path to a destination, traced from the routing table of one node to that of another, visits an intermediate node more than once before the destination is reached.
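The relaxation step at the heart of Bellman-Ford can be sketched in its centralized form. This is not the distributed variant discussed above: it assumes one process knows every link, and the names bellman_ford, nodes, and edges are illustrative.

```python
def bellman_ford(nodes, edges, source):
    """Return shortest distances from source.

    nodes: iterable of node names.
    edges: list of (u, v, cost) directed links; costs may be negative.
    Raises ValueError if a negative-cost cycle is reachable from source.
    """
    nodes = list(nodes)
    dist = {n: float("inf") for n in nodes}
    dist[source] = 0
    # A shortest path uses at most len(nodes) - 1 links, so that many
    # rounds of relaxing every link are enough.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break                 # converged early
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative-cost cycle reachable from source")
    return dist


edges = [("A", "B", 1), ("A", "C", 4), ("B", "C", 2), ("B", "D", 5), ("C", "D", 1)]
print(bellman_ford("ABCD", edges, "A"))  # {'A': 0, 'B': 1, 'C': 3, 'D': 4}
```

Unlike Dijkstra's algorithm, this version tolerates negative link costs, and a node that cannot be reached simply keeps an infinite distance.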
Q5: Describe at least 5 Big Data analysis/process methods. Which data analysis/process method is most suitable for Big Data. Justify your answer. (10 points)
There are several well-known and efficient methods for handling big data, none without its problems. These include accelerated proximal gradient methods, block coordinate descent methods, and alternating direction methods. These first-order methods enjoy low per-iteration complexity but typically have slow local convergence rates, whereas second-order methods are more accurate. A fairly straightforward way to overcome the shortcomings of the two classes of methods is to adopt a hybrid approach, which strikes a balance between the efficiency of first-order methods and the accuracy of second-order methods.
Hierarchical structure appears in many real-world recommendation and targeting systems. In display-ads targeting, for example, each user is associated with a set of labels such as sports, football, vehicles, luxury cars, etc., and each label contains ad campaigns that this user is likely to interact with. The labels are often not independent of each other, and their interdependence can be characterized by a connected (tree) structure, e.g., football is the child of sports. The task then is to assign each user a set of labels that both appeal to the user and satisfy the underlying taxonomy. Overlooking such structure not only undermines predictive performance (loss of information) but also makes the results hard to interpret, e.g., predicting that a user is interested in football but not in sports.
A fourth approach uses Apache Hadoop. The Hadoop Distributed File System (HDFS) splits big data files into parts that are managed by the cluster nodes. HDFS also replicates the data parts across several machines to prevent data unavailability if a machine fails, and it uses active monitoring to re-replicate the data parts when a failure occurs. In the Hadoop programming framework, data is record-oriented: individual input files are broken into records in a format specific to the application logic. Each process on a cluster node computes over a subset of the records, and the framework uses its knowledge of the distributed file system to schedule processes close to where the records are stored. The data a cluster node operates on is thus selected based on its locality to that node.
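The splitting, replication, and re-replication described above can be mimicked in a few lines. This is a toy sketch and not the HDFS API: the round-robin placement, the tiny block size, and the names split_into_blocks, place_replicas, and re_replicate are all assumptions made for illustration.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut a file's bytes into fixed-size parts (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin (assumed policy)."""
    if replication > len(nodes):
        raise ValueError("not enough nodes for the requested replication")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }

def re_replicate(placement, failed_node, nodes):
    """Copy every block held by the failed node to a live node that lacks it."""
    live = [n for n in nodes if n != failed_node]
    repaired = {}
    for block, holders in placement.items():
        kept = [n for n in holders if n != failed_node]
        if len(kept) < len(holders):
            spare = next((n for n in live if n not in kept), None)
            if spare is not None:
                kept.append(spare)
        repaired[block] = kept
    return repaired


nodes = ["n1", "n2", "n3", "n4"]
blocks = split_into_blocks(b"abcdefghij", block_size=4)
placement = place_replicas(len(blocks), nodes, replication=2)
print(blocks)                                  # [b'abcd', b'efgh', b'ij']
print(placement)                               # {0: ['n1', 'n2'], 1: ['n2', 'n3'], 2: ['n3', 'n4']}
print(re_replicate(placement, "n2", nodes))    # {0: ['n1', 'n3'], 1: ['n3', 'n1'], 2: ['n3', 'n4']}
```

After node n2 fails, every block is back to two copies on the surviving nodes.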
A fifth approach uses the Spark framework. When an action is invoked in a Spark application, a Spark job is launched and its execution plan is determined. The plan consists of transformations that are grouped into stages. Each stage consists of many tasks that process different partitions of the data in parallel. Some transformations, such as those with the “ByKey” suffix, shuffle the data and start a new stage, so a Spark job is executed in several stages. The first stage reads the input data and maps it through the map() function. Then, when reduceByKey() is invoked, Spark shuffles the data so that all elements with the same key end up in one partition processed by one task. The shuffled data is written to disk and a new stage is started. The number of partitions in the new stage is given by the second argument of reduceByKey(), if provided; otherwise it is the same as in the parent stage. When the new stage starts, it reads the shuffled data from disk and executes the final reduce transformation. New stages are formed in this way for as long as a shuffle is needed, until all transformations have been executed.
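The two stages described above (map, then shuffle and reduce by key) can be imitated in plain Python. This sketch does not use or reproduce the Spark API; map_stage, shuffle, and reduce_stage are hypothetical helpers, and in-memory lists of lists stand in for partitions.

```python
from collections import defaultdict
from functools import reduce

def map_stage(partitions, fn):
    """Stage 1: apply fn to every element, partition by partition."""
    return [[fn(x) for x in part] for part in partitions]

def shuffle(partitions, num_partitions=None):
    """Move every (key, value) pair so that equal keys share one partition."""
    if num_partitions is None:
        num_partitions = len(partitions)   # default: same count as the parent stage
    out = [defaultdict(list) for _ in range(num_partitions)]
    for part in partitions:
        for key, value in part:
            out[hash(key) % num_partitions][key].append(value)
    return out

def reduce_stage(shuffled, fn):
    """Stage 2: combine the values of each key within its partition."""
    result = {}
    for part in shuffled:
        for key, values in part.items():
            result[key] = reduce(fn, values)
    return result


words = [["a", "b", "a"], ["b", "c"]]                # two input partitions
pairs = map_stage(words, lambda w: (w, 1))
counts = reduce_stage(shuffle(pairs), lambda x, y: x + y)
print(sorted(counts.items()))                        # [('a', 2), ('b', 2), ('c', 1)]
```

Because the shuffle places all pairs with the same key in one partition, each key can be reduced by a single task without consulting any other partition.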
References
Basagni, Stefano, et al., eds. Mobile Ad Hoc Networking: The Cutting Edge Directions. Vol. 35. John Wiley & Sons, 2013.
Ciccarelli, Patrick, et al. Introduction to Networking Basics. John Wiley & Sons, 2012.
Conti, Marco, and Silvia Giordano. "Multihop Ad Hoc Networking: The Evolutionary Path." Mobile Ad Hoc Networking: Cutting Edge Directions, Second Edition (2013): 1-33.