How the Internet Works (And How SOPA Would Break It)

January 12, 2012

Last week, I explained SoftLayer's stance against SOPA and mentioned that SOPA would essentially require service providers like SoftLayer to "break the Internet" in response to reports of "infringing sites." The technical readers in our audience probably acknowledged the point and moved on, but our non-technical readers (and some representatives in Congress) might have gotten a little confused by the references to DNS, domains and IP addresses.

Given how pervasive the Internet is in our daily lives, you shouldn't need to be "a techie" to understand the basics of what makes the Internet work ... And given the significance of the SOPA legislation, you should understand where the bill would "break" the process. Let's take a high-level look at how the Internet works, and from there, we can contrast how it would work if SOPA were to pass.

The Internet: How Sites Are Delivered

  1. You access a device connected in some way to the Internet. This device can be a cell phone, a computer or even a refrigerator. You are connected to the Internet through an Internet Service Provider (ISP) which recognizes that you will be accessing various sites and services hosted remotely. Your ISP manages a network connected to the other networks around the globe ("inter" "network" ... "Internet").
  2. You enter a domain name or click a URL (for this example, we'll use http://www.softlayer.com since we're biased to that site).

Internet Basics

  3. Your ISP will see that you want to access "www.softlayer.com" and will immediately try to find someone/something that knows what "www.softlayer.com" means ... This search is known as an NS (name server) lookup. In this case, it will find that "www.softlayer.com" is associated with several name servers.

  4. The first of these name servers (there are four in this example) to respond with additional information about "softlayer.com" will be used. Domains are typically required to be associated with two or more name servers so that if one is unreachable, requests for that domain name can be processed by another.
  5. The name server has Domain Name System (DNS) information that maps "www.softlayer.com" to an Internet Protocol (IP) address. When a domain name is purchased and provisioned, the owner will associate that domain name with an authoritative DNS name server, and a DNS record will be created with that name server linking the domain to a specific IP address. Think of DNS as a phone book that translates a name into a phone number for you.
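
If you think in code, the name server lookup described above can be sketched as a pair of dictionary lookups. This is a toy model, not a real resolver: the name server hostnames and the IP address are made-up placeholders (the address comes from a range reserved for documentation), and `registered_domain` and `resolve` are hypothetical helpers we invented for illustration.

```python
# Toy model of a DNS lookup. All server names and addresses are
# made-up placeholders, not SoftLayer's real records.

# Which name servers are authoritative for a domain?
NAME_SERVERS = {
    "softlayer.com": ["ns1.example.net", "ns2.example.net"],
}

# The records each name server holds (the "phone book").
RECORDS = {
    "ns1.example.net": {"www.softlayer.com": "192.0.2.10"},
    "ns2.example.net": {"www.softlayer.com": "192.0.2.10"},
}


def registered_domain(hostname):
    """Return the last two labels, e.g. 'softlayer.com'. Good enough for
    this sketch; real code would need a public suffix list."""
    return ".".join(hostname.split(".")[-2:])


def resolve(hostname, unreachable=()):
    """Ask each name server in turn and return the first answer."""
    for server in NAME_SERVERS.get(registered_domain(hostname), []):
        if server in unreachable:
            continue  # this server is down, so try the next one
        address = RECORDS[server].get(hostname)
        if address is not None:
            return address
    return None  # nobody knows this name


print(resolve("www.softlayer.com"))
print(resolve("www.softlayer.com", unreachable={"ns1.example.net"}))
print(resolve("www.unknown-site.com"))
```

The second call is the reason a domain has more than one name server: with the first server marked unreachable, the lookup still succeeds through the second.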

  6. When the server at the IP address you reach sees that you requested "www.softlayer.com," it will find the files/content associated with that request. Multiple domains can be hosted on the same IP address, just as multiple people can live at the same street address and answer the same phone. Each IP address exists in only a single place at a given time. (There are some complex network tricks that can negate that statement, but in the interest of simplicity, we'll ignore them.)
  7. When the requested content is located (and generated by other servers if necessary), it is returned to your browser. Depending on what content you are accessing, the response from the server can be very simple or very complex. In some cases, the request will return a single HTML document. In other cases, the content you access may require additional information from other servers (database servers, storage servers, etc.) before the request can be completely fulfilled. In this case, we get HTML code in return.
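
The "multiple domains on one IP address" idea usually works because the browser sends the host name it wants along with each request, and the server picks the matching site. Here is a minimal sketch of that selection, assuming made-up site names and page contents; `SITES_ON_THIS_IP` and `handle_request` are hypothetical names, not part of any real web server.

```python
# Toy model of one server (one IP address) answering for several domains.
# The domains and page bodies are placeholders for illustration.
SITES_ON_THIS_IP = {
    "www.softlayer.com": "<html><body>SoftLayer homepage</body></html>",
    "www.example.org": "<html><body>Some other site</body></html>",
}


def handle_request(host_header):
    """Pick the content to return based on the requested host name."""
    body = SITES_ON_THIS_IP.get(host_header.lower())
    if body is None:
        # The request reached this IP, but no site here uses that name.
        return 404, "<html><body>Unknown site</body></html>"
    return 200, body


status, html = handle_request("www.softlayer.com")
print(status, html)
```

Both sites live at the same address, so blocking that address would take down both of them, whichever one was reported.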

  8. Your browser takes that code and translates the formatting and content to be displayed on your screen. Often, formatting and styling of pages will be generated from a Cascading Style Sheet (CSS) referenced in the HTML code. The purpose of the style sheet is to streamline a given page's code and consolidate the formatting to be used and referenced by multiple pages of a given website.

  9. The HTML code will reference sources for media that may be hosted on other servers, so the browser will perform the necessary additional requests to get all of the media the website is trying to show. In this case, the most noticeable image that will get pulled is the SoftLayer logo from this location: http://static2.softlayer.com/images/layout/logo.jpg
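
To make the "additional requests" concrete, here is a rough sketch of how a program could find the external files a page refers to, using the HTML parser in Python's standard library. The sample page and the style sheet path are invented for illustration (only the logo URL comes from this post), `ResourceCollector` is a hypothetical class name, and a real browser does far more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect URLs of images, scripts and style sheets in a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            # Relative paths are resolved against the page's own URL.
            self.resources.append(urljoin(self.page_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.resources.append(urljoin(self.page_url, attrs["href"]))


# A tiny made-up page; only the logo URL comes from the article.
SAMPLE_HTML = """
<html>
  <head><link rel="stylesheet" href="/css/site.css"></head>
  <body>
    <img src="http://static2.softlayer.com/images/layout/logo.jpg">
  </body>
</html>
"""

collector = ResourceCollector("http://www.softlayer.com/")
collector.feed(SAMPLE_HTML)
for url in collector.resources:
    print(url)
```

Each URL in the resulting list is another request, and each host name in those URLs (here, static2.softlayer.com) needs its own DNS lookup before the file can be fetched.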

  10. When the HTML is rendered and the media is loaded, your browser will probably note that it is "Done," and you will have successfully navigated to SoftLayer's homepage.

If SOPA were to pass, the process would look like this:

The Internet: Post-SOPA

  1. You access a device connected in some way to the Internet.
  2. You enter a domain name or click a URL (for this example, we'll use http://www.softlayer.com since we're biased to that site).

*The Change*

  3. Before your ISP runs an NS lookup, it would have to determine whether the site you're trying to access has been reported as an "infringing site." If http://www.softlayer.com were reported (either legitimately or illegitimately) as an infringing site, your ISP would not process your request, and you'd be sent to an error page. If your ISP can't find any reference to the domain as an infringing site, it would start looking for the name server to deliver the IP address.
  4. SOPA would also enforce filtering by all authoritative DNS providers. If an ISP sends a request for an infringing site to the name server for that site, the provider of that name server would be forced to prevent the IP address from being returned.
  5. One additional method of screening domains would happen at the level of the operator of the domain's gTLD. gTLDs (generic top-level domains) are the ".____" at the end of the domain (.com, .net, .biz, etc.). Each gTLD is managed by a large registry organization, and a gTLD's operator would be required to prevent an infringing site's domain from functioning properly.
  6. If the gTLD registry operator, your ISP and the domain's authoritative name server provider agree that the site you're accessing has not been reported as an infringing site, the request would continue through the pre-SOPA process.
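
The extra checks described above can be sketched as three lists that must all come up empty before the normal lookup is allowed to run. This is only our illustration of the flow in this post: the list contents, the placeholder addresses, and the names `resolve_post_sopa` and `ADDRESS_BOOK` are all assumptions, and the bill itself does not prescribe code like this.

```python
# Toy model of the extra checks described above. The blocklists and the
# address book are invented placeholders for illustration only.
ADDRESS_BOOK = {
    "www.softlayer.com": "192.0.2.10",
    "www.reported-site.com": "192.0.2.66",
}

# Each party keeps (and must consult) its own list of reported domains.
ISP_BLOCKLIST = {"www.reported-site.com"}
GTLD_REGISTRY_BLOCKLIST = set()
AUTHORITATIVE_DNS_BLOCKLIST = set()


def resolve_post_sopa(hostname):
    """Return an IP address only if no party has the domain on its list."""
    checks = [
        ("ISP", ISP_BLOCKLIST),
        ("gTLD registry", GTLD_REGISTRY_BLOCKLIST),
        ("authoritative DNS provider", AUTHORITATIVE_DNS_BLOCKLIST),
    ]
    for party, blocklist in checks:
        if hostname in blocklist:
            return None, "blocked by " + party
    # Nobody objected, so the normal (pre-SOPA) lookup resumes.
    address = ADDRESS_BOOK.get(hostname)
    return address, "resolved" if address else "not found"


print(resolve_post_sopa("www.softlayer.com"))
print(resolve_post_sopa("www.reported-site.com"))
```

Notice that the sketch never asks whether a report is legitimate. Being on a list is enough to stop the lookup, which is the part of the process we object to.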

*Back to the Pre-SOPA Process*

  7. The domain's name server responds.
  8. The domain's IP address is returned.
  9. The IP address is reached to get the content for http://www.softlayer.com.
  10. HTML is returned.
  11. Your browser translates the HTML into a visual format.
  12. External file references from the HTML are returned.
  13. The site is loaded.

The proponents of SOPA are basically saying, "It's difficult for us to keep up with and shut down all of the instances of counterfeiting and copyright infringement online, but it would be much easier to target the larger sites/providers 'enabling' users to access that (possible) infringement." Right now, the DMCA process requires a formal copyright complaint to be filed for every instance of infringement, and the providers who are hosting the content on their network are responsible for having that content removed. That's what our abuse team does full-time. It's a relatively complex process, but it's a process that guarantees us the ability to investigate claims for legitimacy and to hear from our customers (who hear from their customers) in response to the claims.

SOPA does not allow for due process to investigate concerns. If a site is reported to be an infringing site, service providers have to do everything in their power to prevent users from getting there.

-@toddmitchell

Comments

January 12th, 2012 at 7:29pm

I'm trying to reconcile "Before your ISP runs an NS lookup, it would have to determine whether the site you’re trying to access has been reported as an “infringing site.” If [so, then] your ISP would not process your request."

And "If a site is reported to be an infringing site, service providers have to do everything in their power to prevent users from getting there."

With the Post-SOPA URL request/ Page retrieval process you describe, all an individual who wanted to visit the infringing Softlayer.com site would need to do is enter the raw IP address, bypassing the DNS retrieval step.

But "everything in their power" could be construed to mean something as complicated as detecting the entry of an IP address (instead of a URL) and then using a reverse DNS to determine that the user was attempting to reach an infringing site, and blocking the subsequent... or something as simple as doing a periodic DNS lookup of each infringing site on record and blocking all packets originating from those IPs.

Of course, even those methods can be bypassed, and the effect would likely be to create a black market for the workarounds themselves, either as client software or remote services like VPN or even IP reverse-spoofing...

All in all it's a farce, and will do nothing but cost governments and ISPs money.

January 13th, 2012 at 9:02am

Those are great points, bughunter, and they are consistent with why we believe the bill would be inefficient when it comes to enforcement. DNS works because everyone agrees that DNS works. If ISPs are required to police DNS, people intent on going to infringing sites will circumvent it, and ultimately DNS won't work because people won't agree that DNS works any more.

Given the finite resource of IPv4 space, more and more sites will share single IPv4 addresses, making it impossible to take blanket action on a single IP (considering the collateral damage it would cause to hundreds - if not thousands - of legitimate sites).

The hurdles that are being put up for "infringing site" owners and users are just hurdles, not security fences. The biggest worry is that the bad guys will (relatively easily) find a way to circumvent the actions required by the bill, and the only people affected will be good guys who fall victim to bad guys who take advantage of the legislation and use it to their own gain.

January 13th, 2012 at 10:03pm

I'm proud of SoftLayer taking a stand against SOPA.

Don't let politicians that know nothing about how the internet works break it. They don't understand how much of an impact it would have on the hosting business in the US.

Keep up the fight, never give up.

January 14th, 2012 at 8:29am

Finally a good explanation of why SOPA won't work. Thanks Kevin!

January 14th, 2012 at 8:45pm

Todd, thanks for these posts on SOPA.

As a software developer and writer I certainly understand the need to protect intellectual property rights, but the means by which we do so has to be balanced against effectiveness, cost, and fundamental human rights.

Kevin has it exactly right: DNS works by mutual consent. Any attempts to enforce changes without that consent risk causing the collapse of the system. Notably, there are *already* browser plugins available that will circumvent the SOPA measures.

SOPA will not be effective, and is likely to cause considerable collateral damage. It's bad legislation from every angle.

January 20th, 2012 at 3:29pm

Thank you for taking the right side in this big problem. I am not from America but I will try to keep my servers in this great hosting unless America turns into a digital hell.

If America accepts SOPA then some of persons, maybe... a few hundreds of persons? ... would get even richer and happy because we would be forced to pay them, right, great! Yes, that is good for ... 0,000001% of the world... and bad for the other 99,99999%. Mmmmm... I think it is not a good deal.

January 22nd, 2012 at 8:18am

thanks for these posts on SOPA.

SOPA won’t work. :)
