How the Internet Works (And How SOPA Would Break It)

January 12, 2012

Last week, I explained SoftLayer's stance against SOPA and mentioned that SOPA would essentially require service providers like SoftLayer to "break the Internet" in response to reports of "infringing sites." The technical readers in our audience probably acknowledged the point and moved on, but our non-technical readers (and some representatives in Congress) might have gotten a little confused by the references to DNS, domains and IP addresses.

Given how pervasive the Internet is in our daily lives, you shouldn't need to be "a techie" to understand the basics of what makes the Internet work ... And given the significance of the SOPA legislation, you should understand where the bill would "break" the process. Let's take a high-level look at how the Internet works, and from there, we can contrast how it would work if SOPA were to pass.

The Internet: How Sites Are Delivered

  1. You access a device connected in some way to the Internet. This device can be a cell phone, a computer or even a refrigerator. You are connected to the Internet through an Internet Service Provider (ISP) which recognizes that you will be accessing various sites and services hosted remotely. Your ISP manages a network connected to the other networks around the globe ("inter" "network" ... "Internet").
  2. You enter a domain name or click a URL (for this example, we'll use http://www.softlayer.com since we're biased to that site).

Internet Basics

  3. Your ISP will see that you want to access "www.softlayer.com" and will immediately try to find someone/something that knows what "www.softlayer.com" means ... This search is known as an NS (name server) lookup. In this case, it will find that "www.softlayer.com" is associated with several name servers.

  4. The first of these name servers (four, in this example) to respond with additional information about "softlayer.com" will be used. Domains are typically required to be associated with two or three name servers to ensure that if one is unreachable, requests for that domain name can be processed by another.
  5. The name server has Domain Name System (DNS) information that maps "www.softlayer.com" to an Internet Protocol (IP) address. When a domain name is purchased and provisioned, the owner will associate that domain name with an authoritative DNS name server, and a DNS record will be created with that name server linking the domain to a specific IP address. Think of DNS as a phone book that translates a name into a phone number for you.

  6. When the server at that IP address sees that you requested "www.softlayer.com," it will find the files/content associated with that request. Multiple domains can be hosted on the same IP address, just as multiple people can live at the same street address and answer the phone. Each IP address exists in only one place at a given time. (There are some complex network tricks that can negate that statement, but in the interest of simplicity, we'll ignore them.)
  7. When the requested content is located (and generated by other servers if necessary), it is returned to your browser. Depending on what content you are accessing, the response from the server can be very simple or very complex. In some cases, the request will return a single HTML document. In other cases, the content you access may require additional information from other servers (database servers, storage servers, etc.) before the request can be completely fulfilled. In this case, we get HTML code in return.

  8. Your browser takes that code and translates the formatting and content to be displayed on your screen. Often, formatting and styling of pages will be generated from a Cascading Style Sheet (CSS) referenced in the HTML code. The purpose of the style sheet is to streamline a given page's code and consolidate the formatting to be used and referenced by multiple pages of a given website.

  9. The HTML code will reference sources for media that may be hosted on other servers, so the browser will perform the necessary additional requests to get all of the media the website is trying to show. In this case, the most noticeable image that will get pulled is the SoftLayer logo from this location: http://static2.softlayer.com/images/layout/logo.jpg

  10. When the HTML is rendered and the media is loaded, your browser will probably note that it is "Done," and you will have successfully navigated to SoftLayer's homepage.
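
For readers who like to see things in code, the lookup steps above can be sketched as a toy model in Python. This is only an illustration, not how any real ISP or name server is implemented: the name server names, the 192.0.2.10 address (taken from a range reserved for documentation), the page content and the functions `registered_domain`, `resolve` and `fetch` are all made up for this example.

```python
# Toy model of the lookup described above. Every name, address and page here is
# a hypothetical stand-in; nothing below talks to the real Internet.

# Which name servers are responsible for a domain (the "NS lookup").
NAME_SERVERS = {
    "softlayer.com": ["ns1.example.net", "ns2.example.net"],
}

# The DNS records each name server holds (host name -> IP address).
DNS_RECORDS = {
    "ns1.example.net": {"www.softlayer.com": "192.0.2.10"},
    "ns2.example.net": {"www.softlayer.com": "192.0.2.10"},
}

# One IP address can host several sites, told apart by the requested name.
SERVERS = {
    "192.0.2.10": {
        "www.softlayer.com": "<html>SoftLayer home page</html>",
        "www.example.org": "<html>Another site on the same IP</html>",
    },
}


def registered_domain(hostname):
    """Naive simplification: treat the last two labels as the domain."""
    return ".".join(hostname.split(".")[-2:])


def resolve(hostname, unreachable=()):
    """Ask each name server in turn; use the first reachable one with a record."""
    for name_server in NAME_SERVERS.get(registered_domain(hostname), []):
        if name_server in unreachable:
            continue  # this one is down, so fall back to the next name server
        ip_address = DNS_RECORDS[name_server].get(hostname)
        if ip_address is not None:
            return ip_address
    return None  # nobody knows what this name means


def fetch(hostname, unreachable=()):
    """Resolve the name, then ask the server at that IP for the named site."""
    ip_address = resolve(hostname, unreachable)
    if ip_address is None:
        return None
    return SERVERS[ip_address].get(hostname)
```

Calling `fetch("www.softlayer.com")` walks the same path as the list: find the name servers, get the IP address, then ask that address for the site. Passing `unreachable=("ns1.example.net",)` shows why a domain has more than one name server. On a machine with a network connection, Python's built-in `socket.gethostbyname("www.softlayer.com")` performs a real lookup instead of this simulated one.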

If SOPA were to pass, the process would look like this:

The Internet: Post-SOPA

  1. You access a device connected in some way to the Internet.
  2. You enter a domain name or click a URL (for this example, we'll use http://www.softlayer.com since we're biased to that site).

*The Change*

  1. Before your ISP runs an NS lookup, it would have to determine whether the site you're trying to access has been reported as an "infringing site." If http://www.softlayer.com were reported (either legitimately or illegitimately) as an infringing site, your ISP would not process your request, and you'd be sent to an error page. If your ISP can't find any reference to the domain as an infringing site, it would start looking for the name server to deliver the IP address.
  2. SOPA would also enforce filtering by all authoritative DNS providers. If an ISP sends a request for an infringing site to the name server for that site, the provider of that name server would be forced to prevent the IP address from being returned.
  3. One additional method of screening domains would happen at the level of the operator of the domain's gTLD. gTLDs (generic top-level domains) are the ".____" at the end of the domain (.com, .net, .biz, etc.). Each gTLD is managed by a large registry organization, and a gTLD's operator would be required to prevent an infringing site's domain from functioning properly.
  4. If the gTLD registry operator, your ISP and the domain's authoritative name server provider agree that the site you're accessing has not been reported as an infringing site, the process would resume the pre-SOPA process.

*Back to the Pre-SOPA Process*

  1. The domain's name server responds.
  2. The domain's IP address is returned.
  3. The IP address is reached to get the content for http://www.softlayer.com.
  4. HTML is returned.
  5. Your browser translates the HTML into a visual format.
  6. External file references from the HTML are returned.
  7. The site is loaded.
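
The extra checks described under "The Change" can also be sketched in Python. This is a rough illustration under loud assumptions: the bill does not define any data structure like this, and the `BLOCKLISTS` table, the made-up `.example` domains and the functions `blocked_by` and `lookup` exist only for this example.

```python
# Hypothetical sketch of the post-SOPA checks. The per-party lists of reported
# domains are invented for illustration; no real reports are represented here.

# Each party that would have to filter, in the order the steps above list them.
BLOCKLISTS = {
    "ISP": {"reported-to-isp.example"},
    "authoritative DNS provider": {"reported-to-dns.example"},
    "gTLD registry operator": {"reported-to-registry.example"},
}


def blocked_by(domain):
    """Return the first party that has the domain on its list, or None."""
    for party, reported_domains in BLOCKLISTS.items():
        if domain in reported_domains:
            return party
    return None


def lookup(domain, resolve):
    """Run the normal lookup only if no party has the domain reported."""
    party = blocked_by(domain)
    if party is not None:
        # A report is enough: the request stops here and the user sees an error.
        raise LookupError(f"{domain} blocked by {party}")
    return resolve(domain)  # back to the pre-SOPA process
```

Here `resolve` stands in for the whole pre-SOPA lookup. Note that nothing in `blocked_by` asks whether a report is legitimate; a domain only has to appear on one list for the request to fail.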

The proponents of SOPA are basically saying, "It's difficult for us to keep up with and shut down all of the instances of counterfeiting and copyright infringement online, but it would be much easier to target the larger sites/providers 'enabling' users to access that (possible) infringement." Right now, the DMCA process requires a formal copyright complaint to be filed for every instance of infringement, and the providers who are hosting the content on their network are responsible for having that content removed. That's what our abuse team does full-time. It's a relatively complex process, but it's a process that guarantees us the ability to investigate claims for legitimacy and to hear from our customers (who hear from their customers) in response to the claims.

SOPA does not allow for due process to investigate concerns. If a site is reported to be an infringing site, service providers have to do everything in their power to prevent users from getting there.

-@toddmitchell