5.1 The Internet
After studying this section you should be able to do the following:
- Discuss the role of hosts, domains, IP addresses, and the DNS in making the Internet work.
- Describe the main aspects of the Internet infrastructure.
- Understand what a router does and the role these devices play in networking.
- Understand why mastery of Internet infrastructure is critical to modern finance and be able to discuss the risks in automated trading systems.
The Internet is a network of networks—millions of them, actually. If the network at your university, your employer, or your home has Internet access, it connects to an Internet service provider (ISP). Many (but not all) ISPs are big telecommunications companies like Verizon, Comcast, and AT&T. These providers connect to one another, exchanging traffic, and ensuring your messages can get to any other computer that’s online and willing to communicate with you.
The Internet has no center and no one owns it. That’s a good thing. The Internet was designed to be redundant and fault-tolerant—meaning that if one network, connecting wire, or server stops working, everything else should keep on running. Rising from military research and work at educational institutions dating as far back as the 1960s, the Internet really took off in the 1990s, when graphical Web browsing was invented. Much of the Internet’s operating infrastructure was transitioned to be supported by private firms rather than government grants.
TCP/IP: The Internet’s Secret Sauce
OK, we know how to read a Web address, we know that every device connected to the Net needs an IP address, and we know that the DNS can look at a Web address and find the IP address of the machine that you want to communicate with. But how does a Web page, an e-mail, or an iTunes download actually get from a remote computer to your desktop?
For our next part of the Internet journey, we’ll learn about two additional protocols: TCP and IP. These protocols are often written as TCP/IP and pronounced by reading all five letters in a row, “T-C-P-I-P” (sometimes they’re also referred to as the Internet protocol suite). TCP and IP are built into any device that a user would use to connect to the Internet—from handhelds to desktops to supercomputers—and together TCP/IP make Internet working happen.
TCP and IP operate below http and the other application transfer protocols mentioned earlier. TCP (transmission control protocol) works its magic at the start and endpoint of the trip—on both your computer and on the destination computer you’re communicating with. Let’s say a Web server wants to send you a large Web page. The Web server application hands the Web page it wants to send to its own version of TCP. TCP then slices up the Web page into smaller chunks of data called packets (or datagrams). The packets are like little envelopes containing part of the entire transmission—they’re labeled with a destination address (where it’s going) and a source address (where it came from). Now we’ll leave TCP for a second, because TCP on the Web server then hands those packets off to the second half of our dynamic duo, IP.
It’s the job of IP (Internet protocol) to route the packets to their final destination, and those packets might have to travel over several networks to get to where they’re going. The relay work is done via special computers called routers, and these routers speak to each other and to other computers using IP (since routers are connected to the Internet, they have IP addresses, too. Some are even named). Every computer on the Internet is connected to a router, and all routers are connected to at least one (and usually more than one) other router, linking up the networks that make up the Internet.
Routers don’t have perfect, end-to-end information on all points in the Internet, but they do talk to each other all the time, so a router has a pretty good idea of where to send a packet to get it closer to where it needs to end up. This chatter between the routers also keeps the Internet decentralized and fault-tolerant. Even if one path out of a router goes down (a networking cable gets cut, a router breaks, the power to a router goes out), as long as there’s another connection out of that router, then your packet will get forwarded. Networks fail, so good, fault-tolerant network design involves having alternate paths into and out of a network.
Once packets are received by the destination computer (your computer in our example), that machine’s version of TCP kicks in. TCP checks that it has all the packets, makes sure that no packets were damaged or corrupted, requests replacement packets (if needed), and then puts the packets in the correct order, passing a perfect copy of your transmission to the program you’re communicating with (an e-mail server, Web server, etc.).
This progression—application at the source to TCP at the source (slice up the data being sent), to IP (for forwarding among routers), to TCP at the destination (put the transmission back together and make sure it’s perfect), to application at the destination—takes place in both directions, starting at the server for messages coming to you, and starting on your computer when you’re sending messages to another computer.
What Connects the Routers and Computers?
Routers are connected together, either via cables or wirelessly. A cable connecting a computer in a home or office is probably copper (likely what’s usually called an Ethernet cable), with transmissions sent through the copper via electricity. Long-haul cables, those that carry lots of data over long distances, are usually fiber-optic lines—glass lined cables that transmit light (light is faster and travels farther distances than electricity, but fiber-optic networking equipment is more expensive than the copper-electricity kind). Wireless transmission can happen via Wi-Fi (for shorter distances), or cell phone tower or satellite over longer distances. But the beauty of the Internet protocol suite (TCP/IP) is that it doesn’t matter what the actual transmission media are. As long as your routing equipment can connect any two networks, and as long as that equipment “speaks” IP, then you can be part of the Internet.
In reality, your messages likely transfer via lots of different transmission media to get to their final destination. If you use a laptop connected via Wi-Fi, then that wireless connection finds a base station, usually within about three hundred feet. That base station is probably connected to a local area network (LAN) via a copper cable. And your firm or college may connect to fast, long-haul portions of the Internet via fiber-optic cables provided by that firm’s Internet service provider (ISP).
Most big organizations have multiple ISPs for redundancy, providing multiple paths in and out of a network. This is so that if a network connection provided by one firm goes down, say an errant backhoe cuts a cable, other connections can route around the problem.
In the United States (and in most deregulated telecommunications markets), Internet service providers come in all sizes, from smaller regional players to sprawling international firms. When different ISPs connect their networking equipment together to share traffic, it’s called peering. Peering usually takes place at neutral sites called Internet exchange points (IXPs), although some firms also have private peering points. Carriers usually don’t charge one another for peering. Instead, “the money is made” in the ISP business by charging the end-points in a network—the customer organizations and end users that an ISP connects to the Internet. Competition among carriers helps keep prices down, quality high, and innovation moving forward.
Finance Has a Need for Speed
When many folks think of Wall Street trading, they think of the open outcry pit at the New York Stock Exchange (NYSE). But human traders are just too slow for many of the most active trading firms. Over half of all U.S. stock trades and a quarter of worldwide currency trades now happen via programs that make trading decisions without any human intervention (Timmons, 2006). There are many names for this automated, data-driven frontier of finance—algorithmic trading, black-box trading, or high-frequency trading. And while firms specializing in automated, high-frequency trading represent only about 2 percent of the trading firms operating in the United States, they account for about three quarters of all U.S. equity trading volume (Iati, 2009).
Programmers lie at the heart of modern finance. “A geek who writes code—those guys are now the valuable guys” says the former head of markets systems at Fidelity Investments, and that rare breed of top programmer can make “tens of millions of dollars” developing these systems (Berenson, 2009). Such systems leverage data mining and other model-building techniques to crunch massive volumes of data and discover exploitable market patterns. Models are then run against real-time data and executed the instant a trading opportunity is detected. (For more details on how data is gathered and models are built, see Chapter 11 “The Data Asset: Databases, Business Intelligence, and Competitive Advantage”.)
Winning with these systems means being quick—very quick. Suffer delay (what techies call latency) and you may have missed your opportunity to pounce on a signal or market imperfection. To cut latency, many trading firms are moving their servers out of their own data centers and into colocation facilities. These facilities act as storage places where a firm’s servers get superfast connections as close to the action as possible. And by renting space in a “colo,” a firm gets someone else to manage the electrical and cooling issues, often providing more robust power backup and lower energy costs than a firm might get on its own.
Equinix, a major publicly traded IXP and colocation firm with facilities worldwide, has added a growing number of high-frequency trading firms to a roster of customers that includes e-commerce, Internet, software, and telecom companies. In northern New Jersey alone (the location of many of the servers where “Wall Street” trading takes place), Equinix hosts some eighteen exchanges and trading platforms as well as the NYSE Secure Financial Transaction Infrastructure (SFTI) access node.
Less than a decade ago, eighty milliseconds was acceptably low latency, but now trading firms are pushing below one millisecond into microseconds (Schmerken, 2009). So it’s pretty clear that understanding how the Internet works, and how to best exploit it, is of fundamental and strategic importance to those in finance. But also recognize that this kind of automated trading comes with risks. Systems that run on their own can move many billions in the blink of an eye, and the actions of one system may cascade, triggering actions by others.
The spring 2010 “Flash Crash” resulted in a nearly 1,000-point freefall in the Dow Jones Industrial Index, it’s biggest intraday drop ever. Those black boxes can be mysterious—weeks after the May 6th event, experts were still parsing through trading records, trying to unearth how the flash crash happened (Daimler & Davis, 2010). Regulators and lawmakers recognize they now need to understand technology, telecommunications, and its broader impact on society so that they can create platforms that fuel growth without putting the economy at risk.
Watching the Packet Path via Traceroute
Want to see how packets bounce from router to router as they travel around the Internet? Check out a tool called traceroute. Traceroute repeatedly sends a cluster of three packets starting at the first router connected to a computer, then the next, and so on, building out the path that packets take to their destination.
Traceroute is built into all major desktop operating systems (Windows, Macs, Linux), and several Web sites will run traceroute between locations (traceroute.org and visualroute.visualware.com are great places to explore).
The message below shows a traceroute performed between Irish firm VistaTEC and Boston College. At first, it looks like a bunch of gibberish, but if we look closely, we can decipher what’s going on.
The table above shows ten hops, starting at a domain in vistatec.ie and ending in 22.214.171.124 (the table doesn’t say this, but all IP addresses starting with 136.167 are Boston College addresses). The three groups of numbers at the end of three lines shows the time (in milliseconds) of three packets sent out to test that hop of our journey. These numbers might be interesting for network administrators trying to diagnose speed issues, but we’ll ignore them and focus on how packets get from point to point.
At the start of each line is the name of the computer or router that is relaying packets for that leg of the journey. Sometimes routers are named, and sometimes they’re just IP addresses. When routers are named, we can tell what network a packet is on by looking at the domain name. By looking at the router names to the left of each line in the traceroute above, we see that the first two hops are within the vistatec.ie network. Hop 3 shows the first router outside the vistatec.ie network. It’s at a domain named tinet.net, so this must be the name of VistaTEC’s Internet service provider since it’s the first connection outside the vistatec.ie network.
Sometimes routers names suggest their locations (oftentimes they use the same three character abbreviations you’d see in airports). Look closely at the hosts in hops 3 through 7. The subdomains dub20, lon11, lon01, jfk02, and bos01 suggest the packets are going from Dublin, then east to London, then west to New York City (John F. Kennedy International Airport), then north to Boston. That’s a long way to travel in a fraction of a second!
Hop 4 is at tinet.net, but hop 5 is at cogentco.com (look them up online and you’ll find out that cogentco.com, like tinet.net, is also an ISP). That suggests that between those hops peering is taking place and traffic is handed off from carrier to carrier.
Hop 8 is still cogentco.com, but it’s not clear who the unnamed router in hop 9, 126.96.36.199, belongs to. We can use the Internet to sleuth that out, too. Search the Internet for the phrase “IP address lookup” and you’ll find a bunch of tools to track down the organization that “owns” an IP address. Using the tool at whatismyip.com, I found that this number is registered to PSI Net, which is now part of cogentco.com.
Routing paths, ISPs, and peering all revealed via traceroute. You’ve just performed a sort of network “CAT scan” and looked into the veins and arteries that make up a portion of the Internet. Pretty cool!
If you try out traceroute on your own, be aware that not all routers and networks are traceroute friendly. It’s possible that as your trace hits some hops along the way (particularly at the start or end of your journey), three “*” characters will show up at the end of each line instead of the numbers indicating packet speed. This indicates that traceroute has timed out on that hop. Some networks block traceroute because hackers have used the tool to probe a network to figure out how to attack an organization. Most of the time, though, the hops between the source and destination of the traceroute (the steps involving all the ISPs and their routers) are visible.
Traceroute can be a neat way to explore how the Internet works and reinforce the topics we’ve just learned. Search for traceroute tools online or browse the Internet for details on how to use the traceroute command built into your computer.
There’s Another Internet?
If you’re a student at a large research university, there’s a good chance that your school is part of Internet2. Internet2 is a research network created by a consortium of research, academic, industry, and government firms. These organizations have collectively set up a high-performance network running at speeds of up to one hundred gigabits per second to support and experiment with demanding applications. Examples include high-quality video conferencing; high-reliability, high-bandwidth imaging for the medical field; and applications that share huge data sets among researchers.
If your university is an Internet2 member and you’re communicating with another computer that’s part of the Internet2 consortium, then your organization’s routers are smart enough to route traffic through the superfast Internet2 backbone. If that’s the case, you’re likely already using Internet2 without even knowing it!
- TCP/IP, or the Internet protocol suite, helps get perfect copies of Internet transmissions from one location to another. TCP works on the ends of transmission, breaking up transmissions up into manageable packets at the start and putting them back together while checking quality at the end. IP works in the middle, routing packets to their destination.
- Routers are special computing devices that forward packets from one location to the next. Routers are typically connected with more than one outbound path, so in case one path becomes unavailable, an alternate path can be used.
- Many firms in the finance industry have developed automated trading models that analyze data and execute trades without human intervention. Speeds substantially less than one second may be vital to capitalizing on market opportunities, so firms are increasingly moving equipment into collocation facilities that provide high-speed connectivity to other trading systems.
Questions and Exercises
- How can the Internet consist of networks of such physically different transmission media—cable, fiber, and wireless?
- What are the risks in the kinds of automated trading systems described in this section? Conduct research and find an example of where these systems have caused problems for firms and/or the broader market. What can be done to prevent such problems? Whose responsibility is this?
- Find out if your school or employer is an Internet2 member. If it is, run traceroutes to schools that are and are not members of Internet2. What differences do you see in the results?
Berenson, A., “Arrest Over Software Illuminates Wall St. Secret,” New York Times, August 23, 2009.
Daimler E. and G. Davis, “‘Flash Crash’ Proves Diversity Needed in Market Mechanisms,” Pittsburgh Post-Gazette, May 29, 2010.
Iati, R., “The Real Story of Trading Software Espionage,” Advanced Trading, July 10, 2009.
Schmerken, I.,“High-Frequency Trading Shops Play the Colocation Game,” Advanced Trading, October 5, 2009.
Timmons, H., “A London Hedge Fund that Opts for Engineers, Not M.B.A.’s,” New York Times, August 18, 2006.
Business Information Systems: Design an App for That
Raymond Frost, Ohio University
Jacqueline Pike, Duquesne University
Lauren Kenyo, Ohio University
Sarah Pels, Ohio University
Pub Date: 2011
ISBN 13: 9781453311578
Publisher: Saylor Foundation