P2P Networks (TCD 4BA2 Project 2002/03)




Napster

Time magazine
 

Authors:

Susan Crosse crosses@tcd.ie
Elaine Wilson wilsone@tcd.ie
AnneMarie Walsh awalsh2@tcd.ie
David Coen coend@tcd.ie
Charles Smith smithcr@tcd.ie

 

Background of Napster

The Internet was originally built as a peer-to-peer system in the late 1960s to share computing resources within the US. The first hosts on ARPANET were connected together as equal peers rather than in a client-server arrangement. The main users at the beginning were computing researchers who did not need protection from each other, and security break-ins were practically non-existent, making the Internet much less partitioned than it is today. Many peer-to-peer systems from that era were widely used and are still in existence today, such as Usenet and DNS. In 1994, however, the structure of the Internet changed dramatically as millions of people flocked to the Net. Modem connection protocols like SLIP and PPP became widely used, and applications now targeted slow analog modems. Applications such as web browsers were based on a client-server protocol, and the structure of the Internet switched from the peer-to-peer model to the client-server model.

However, there are still many applications using the peer-to-peer model, one of the most successful being Napster. It all started when 19-year-old Shawn Fanning was sitting in his dorm room at Northeastern University in Boston. He was listening to his roommate complain about dead MP3 links when he came up with the idea for Napster. "I had this idea that there was a lot of material out there sitting on people's hard drives … and I had to figure out a way to go and get it." He wanted to find an easier way for music listeners to share their favourite recordings with each other. He knew enough about Unix programming to believe such a program was achievable and dedicated all his time and effort to reaching his goal. He had to learn Windows programming as well as Unix server code. The whole concept took him over completely and he decided to drop out of college to finish his work. He moved into his uncle's office and began his work on Napster. Once completed, Napster was a huge success and became one of the fastest growing sites in history, reaching the 25 million mark in less than a year of operation.

Before Napster, recordings were already available on the Net. Using the MP3 compression format, music tracks could be stored as disk files and then published on a website, and people would download them using FTP. The one major problem was that up-to-date MP3 files were difficult to find. Napster solved this problem by providing a constantly up-to-date listing of MP3 files in a single location that everyone knew about. Users could register with the searchable Napster network name space and find files easily through Napster servers, which held information on registered hosts and their MP3 data. The servers coordinated the transfer of files between clients but didn't actually store any of the music themselves. Napster's network protocol created direct peer-to-peer access between clients. It is the simplicity of use that peer-to-peer provides that helped with the success of Napster and other applications that use it.

 

Napster Architecture and How it Works

Peer-to-Peer (P2P) is a form of distributed computing that can be described as the sharing of computer resources (such as files, including MP3s) and computer services by direct transfer between two computer systems.

Traditionally, the exchange of resources and services between computer systems is done using Client-Server techniques. A Client-Server system is one in which there is a dominant computer (the Server) that is connected to several other computers with less control (the Clients). Clients can communicate with other clients only through the Server. With P2P systems, there is no such dominant server; control is decentralized. Each node or peer on the network may act as both a client and a server. Clients in a P2P network can interact freely with other clients without the intervention of a server, although sometimes a directory server (which stores IP addresses and other information about the computers in the network) is present for lookup purposes. The diagram below depicts the process of file transfer in a P2P network.

diagram

There are three major types of P2P network:

  1. Pure P2P:
    • Peers act as both clients and servers.
    • There is no central server.
    • There is no central router.
  2. Hybrid P2P:
    • Has a central server that keeps information on peers and responds to requests for that information.
    • Peers are responsible for hosting the information (as the central server doesn't store files), for letting the central server know what files they want to share, and for transferring their shareable resources to peers that request them.
    • Route terminals are used as addresses, which are referenced by a set of indices to obtain an absolute address.
  3. Mixed P2P:
    • Has both pure and hybrid characteristics.

Napster is an example of a hybrid P2P system. Napster has a centralised directory (actually several) that describes where files reside in the Napster network, and each host registers with this directory when it joins the network. The centralised directories therefore store the IP address, the names of the files the host wants to share, and other data about each computer system connected to them.

How does Napster work?

diagram

  1. Each user must have the Napster software in order to partake in file transfers. The user runs the Napster program. Once executed, this program checks for an Internet connection.
  2. If an Internet connection is detected, another connection between the user's computer and one of Napster's Central Servers will be established. This connection is made possible by the Napster file-sharing software.
  3. The Napster Central Server keeps a directory of all client computers connected to it and stores information on them as described above.
  4. If a user wants a certain file, they place a request with the Napster Central Server that they are connected to.
  5. The Napster Server looks up its directory to see if it has any matches for the user's request.
  6. The Server then sends the user a list of all the matches (if any) it has found, including the corresponding IP address, user name, file size, ping number, bit rate etc.
  7. The user chooses the file they wish to download from the list of matches and tries to establish a direct connection with the computer upon which the desired file resides. They try to make this connection by sending a message to the client computer indicating their own IP address and the name of the file they want to download from the client.
  8. If a connection is made, the client computer where the desired file resides is now considered the host. The host now transfers the file to the user.
  9. The host computer breaks the connection with the user's computer when downloading is complete.
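The steps above can be sketched as a small in-memory simulation. This is only an illustration of the hybrid P2P idea, not Napster's actual protocol or code: the class and function names (`Peer`, `DirectoryServer`, `download`), the example addresses and the `network` dictionary that stands in for real network connections are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A client that keeps its own files and serves them directly to other peers."""
    address: str                                  # stands in for an IP address
    files: dict = field(default_factory=dict)     # file name -> file contents

    def send_file(self, name: str) -> bytes:
        # Steps 7-9: the host hands the file straight to the requesting peer.
        return self.files[name]


class DirectoryServer:
    """Central index: knows who shares which file names, never the contents."""

    def __init__(self):
        self.index = {}                           # file name -> set of addresses

    def register(self, address: str, file_names) -> None:
        # Steps 2-3: a peer connects and announces the files it shares.
        for name in file_names:
            self.index.setdefault(name, set()).add(address)

    def search(self, term: str) -> list:
        # Steps 4-6: return every (file name, address) pair matching the term.
        term = term.lower()
        return sorted(
            (name, address)
            for name, addresses in self.index.items()
            if term in name.lower()
            for address in addresses
        )


def download(server: DirectoryServer, network: dict, term: str):
    """Ask the server where a file is, then fetch it directly from that peer."""
    matches = server.search(term)
    if not matches:
        return None
    name, address = matches[0]                    # the user picks one match
    host = network[address]                       # direct peer-to-peer contact
    return host.send_file(name)                   # the server is not involved


# Hypothetical example: two peers register, then one file is fetched.
alice = Peer("10.0.0.1", {"song_a.mp3": b"AAA"})
bob = Peer("10.0.0.2", {"song_b.mp3": b"BBB"})
server = DirectoryServer()
network = {}
for peer in (alice, bob):
    server.register(peer.address, peer.files.keys())
    network[peer.address] = peer

data = download(server, network, "song_b")
```

Note that the server only ever sees file names and addresses; the file contents travel between the two peers, which is the property that distinguishes a hybrid P2P system from a plain client-server one.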

How does Napster search for and trade files?

Client/Server Protocol

Napster searches for files using the client/server protocol, which runs over TCP/IP. The server maintains a 'master list' of all computers connected to it, and it searches for files by looking up its directory as described in the steps above.

Client/Client Protocol

Napster allows file transfer that is independent of the server. It occurs directly between two client computers. There are four transfer modes: upload, download, firewall upload and firewall download.
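The text names the four transfer modes without saying how one is chosen. The following sketch shows one plausible selection rule, under the assumption that the "firewall" modes are used when the machine holding the file cannot accept incoming connections, so the direction of the connection is reversed. The names `Mode` and `transfer_modes` are hypothetical and not taken from the Napster software.

```python
from enum import Enum


class Mode(Enum):
    DOWNLOAD = "download"
    UPLOAD = "upload"
    FIREWALL_DOWNLOAD = "firewall download"
    FIREWALL_UPLOAD = "firewall upload"


def transfer_modes(host_firewalled: bool, requester_firewalled: bool):
    """Return (requester mode, host mode), or None if no direct link can be made.

    Assumption: a firewalled machine can open outgoing connections
    but cannot accept incoming ones.
    """
    if not host_firewalled:
        # Normal case: the requester connects to the host and pulls the file.
        return Mode.DOWNLOAD, Mode.UPLOAD
    if not requester_firewalled:
        # The host cannot be reached, so it connects out to the requester
        # and pushes the file instead.
        return Mode.FIREWALL_DOWNLOAD, Mode.FIREWALL_UPLOAD
    # Neither side can accept a connection, so no direct transfer is possible.
    return None
```

For example, `transfer_modes(True, False)` selects the firewall pair, while two firewalled clients get `None` because neither can accept the other's connection.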

 

Legal Issues surrounding Napster

After its creation in May 1999 by founder Shawn Fanning, Napster was soon seen by many record companies as an immense threat to their potential earnings, and hence worth pursuing through the courts. The resulting court case involved the Recording Industry Association of America (RIAA), which includes such music industry giants as AOL Time Warner's Warner Music, BMG, EMI and Sony Music among others, suing Napster for breach of copyright law, saying that it had illegally founded a business based on the use of copyrighted material that it did not have the right to distribute. The record companies believed that the whole file-sharing movement would cost the music industry millions in falling sales and unpaid royalties, while Napster argued that it was merely providing a service and that it was the users who were doing the distributing. The following is an account of the legal proceedings involving both parties, which started just seven months after Napster was founded.

Initially, the RIAA contacted Napster several times (including in writing) requesting the removal of copyrighted material from its system, and only after claiming to have received no response did it decide to file a lawsuit seeking an injunction. Napster, by contrast, claimed to have responded (in writing), seeking to arrange a settlement that would have suited both parties. At this point, Napster also added a disclaimer to its system, saying that playing, transmitting, copying and creating any music-related files might infringe federal law and that it was up to the user whether to adhere to this or not.

In December 1999, the RIAA filed its lawsuit against Napster at the US District Court for the Northern District of California (in San Francisco), saying that Napster broke both federal and state laws through "contributory and vicarious copyright infringement" and that it wished to have the service shut down permanently. Obviously, merely providing a list of existing music files for anyone with Internet access to view and use is hardly a breach of copyright law in itself - this lawsuit meant the court would have to decide how close Napster's relationship with its users was in facilitating such practices. The reason that these companies were targeting Napster (and why they felt they had a case) was the fact that Napster acted as a third party running a directory service for its users, while clones such as Gnutella merely set up a direct P2P link between two hosts.

On July 26 2000, Judge Patel in San Francisco issued an injunction ordering Napster to halt its service, siding with the record companies, while still allowing Napster the chance to appeal. Napster argued that personal copying of music is protected by federal law under the Audio Home Recording Act of 1992, but the judge declared that computers were not classified as home recording devices. Napster claimed that the injunction would put it out of business - in contrast, the record companies said Napster was costing them over $300 million/year in lost sales. Napster also (somewhat cheekily) claimed that it did not know that copyright infringement was taking place (i.e. "all this MP3 sharing is nothing to do with us!") - Judge Patel was not impressed!

Two days later, Napster successfully appealed the injunction and was allowed to remain in operation, at least for the time being. The case now needed a full trial and it was up to the RIAA to pursue it, either as far as the Ninth US Circuit Court of Appeals or all the way to the US Supreme Court. The RIAA was very upset at the success of the appeal but expected the higher courts to rule in its favour once they looked at the case in detail. In response to their success, jubilant Napster spokespeople urged users to continue what they called their 'Buycott', i.e. to continue to buy published music by artists that supported the free trade of digital music.

In early October of the same year, both parties were ordered to present their arguments in front of the appeals court. At this point, the record companies realised that a complete shutdown of Napster was unlikely and hence requested that their copyrighted material be removed or blocked from the system, failing which Napster would be charged with copyright infringement. Napster, on the other hand, was hoping to come to an arrangement with the music industry by establishing a fee-paying service, where users paid a nominal monthly fee, most of which would be forwarded to the record companies that owned the copyright for the downloaded material. This idea was not embraced by the RIAA, which still wished to have the matter settled in the courts.

However, it is worth mentioning that by the end of that month, BMG (a member of the RIAA) announced that it would work with Napster on a system similar to the one suggested, and in the end it withdrew from the legal proceedings. This became a reality at the end of January 2001.

Despite what was happening with BMG, the other companies were still pursuing their case and seemed to have obtained what they desired with a ruling on February 12, 2001. The judge ruled that the companies had to provide Napster with details of the files they wanted removed from the system (i.e. title, artist name, names of files on Napster relating to these works, and certification that the companies control the rights being infringed) and that it was Napster's responsibility to carry out the removal. Both parties also had to undertake reasonable measures to remove files with name variations of such files, in order to prevent users using e.g. shortened or misspelled file names. Once this was done, Napster would have 72 hours to remove the files from its system.

By March 28, there was still conflict between those involved. Napster said that it had removed over 250,000 unique songs with over 1.6 million different filenames, while the record companies claimed to have given it 8 million file names, covering over 600,000 different works. The two main problems were misspelled filenames, which could not easily be related to a particular piece of material, and the fact that many of the song names given to Napster came with no associated filename.

Eventually, and somewhat ironically, Napster came to an agreement with AOL Time Warner and EMI, as well as several independent record labels in Britain and Europe, similar to the existing one with BMG, under which the labels would receive payment for files downloaded. Having been ordered to remain offline until all copyrighted material had been removed, Napster spiralled into bankruptcy, and with a court ruling blocking the sale of Napster to Bertelsmann AG, the last hopes of reviving the service disappeared.

 

The legacy of Napster: Online warfare on P2P networks

Ever since the court ruling in July 2000 ordering the shutting down of Napster's file swapping service, the question has been raised as to how to enforce such a decision, not just against Napster but against the plethora of similar services which have appeared in the wake of the company's bankruptcy. Here we explore how music and film labels since Napster are fighting file swapping with "hacking" techniques.

Systems such as Gnutella do not have any central directory server as Napster did. To deal with these, something is needed which attacks the overall structure of P2P networks. The new techniques being considered are technologically based, and in many cases music labels are considering attacking these services using "hacking techniques".

During the Napster legal battle, the RIAA (Recording Industry Association of America) included among the defendants several universities whose students were hosting file servers using the university network. Many of these subsequently agreed to block all traffic relating to Napster in order to avoid further legal proceedings. Similarly, some ISPs have started to discourage or ban their subscribers from serving files from their machines. [http://news.dmusic.com/print/5683] As part of the Digital Millennium Copyright Act [http://lcweb.loc.gov/copyright/legislation/dmca.pdf], ISPs can be forced to identify suspected infringers of copyright. This process is inevitably slow, being complicated by legal wrangling, and anti-piracy techniques are now being used that do not require court intervention.

A method routinely employed at present is known as "spoofing", and involves flooding repositories of copyrighted material with fake, corrupt or otherwise undesirable files. The objective here is that users will find the services extremely unreliable and hence be discouraged from simply browsing and downloading files since there would be a high probability of getting a file containing nothing but 3 minutes of silence, for example. Artists such as Eminem have used this immediately preceding the release of certain recent albums, and several P2P networks are reporting co-ordinated spoofing attacks on their networks from sources using narrow bands of IP addresses, presumably corporate networks.

But while these types of attacks may be irritating for P2P network users, many networks are adapting to these techniques, and the forthcoming version of Morpheus will contain a rating system whereby users can rate the authenticity of files, thereby allowing others to avoid downloading a file which has been marked as bogus. This is leading the music industry to consider other approaches to fighting piracy, including those normally associated with hackers, such as denial-of-service attacks and domain name hijacking.

A denial-of-service attack involves bombarding a site or machine on the net with requests such that it becomes overloaded and unable to serve real users. Multiple very slow downloads can be started which effectively clog the entire bandwidth of a target machine. Domain name hijacking is generally accomplished by changing DNS information so that traffic destined for a particular site gets redirected to a location chosen by the hijacker. As well as impairing P2P sites, the traffic could be analysed, potentially leading to more targets for attack and effectively "infiltrating" the system. A more extreme example is the suggestion that purpose designed worms and viruses could be used to attack and disable machines running P2P software.

The important difference between this form of "hacking" and the approaches mentioned earlier is that hacking of any form is, at present, illegal. "At present" is the important phrase here, because a proposed bill written by US Congressman Howard Berman would provide a degree of legal immunity to copyright holders hacking in pursuit of copyright violations. The exact techniques which would be permitted are to be kept secret and only revealed to the US Attorney General; however, intentional deletion of files would be forbidden. The new rules would make it quite difficult for those whose systems have been damaged by such an attack to claim any form of compensation, since they would require the Attorney General's permission before filing any lawsuit, and they could only claim if the cost of the damage was greater than $250.

Whilst some may argue that it is appropriate to allow these organisations to use the tools they require to protect their business, the bill is seen by others as going very much too far, and being a license to distribute viruses or generally cause whatever type of mayhem Hollywood sees fit to impair P2P. Berman says that:

"Congress should free copyright creators and owners to develop and deploy technological tools for addressing P2P piracy" [http://www.house.gov/berman/p2p062502.html]

From a legal standpoint it would seem that copyright holders have the right to attempt to disable or block P2P networks; however, their options are presently limited by what is perceived as being malicious use of computers. Providing them with an all-purpose get-out-of-jail-free card may certainly help to deter those who would flout copyright law, but there is a grave risk that they could cause accidental damage to a completely innocent Internet user, who would have few rights as concerns compensation. Steve Griffin of StreamCast networks says: "Even law enforcement officers are not above the law; copyright owners certainly should not be" [http://news.com.com/2010-1078-942330.html]

It is easy to forget that there are many legitimate uses for P2P systems, so giving someone permission to sabotage them at will could seriously impact the development of what has undoubtedly shown itself to be a useful technology. Clumsy, ill-conceived and damaging hacking attacks could potentially be carried out with little way to extract any form of recompense from those responsible. It is in essence setting the stage for a high-tech battle between some of the world's largest companies and a vast file-sharing community, which makes more than 3 billion P2P downloads per month.

What direction will be taken from here is uncertain. Berman's legislation was introduced in the US Congress in July [http://www.house.gov/berman/pr072502.htm], and he has since come under heavy attack in the media, from the Electronic Frontier Foundation [http://www.eff.org/IP/P2P/20020802_eff_berman_p2p_bill.html] and others [http://www.wired.com/news/politics/0,1283,54153,00.html], despite his claims that the bill does not allow for unrestrained interference with P2P networks [http://www.house.gov/berman/remarks092602.htm]. At this stage it is difficult to judge the effects of the proposed law; however, it is clear that the recording industry's victories against Napster in the courts are only the beginning of a major conflict which will take place, most likely not in the courtroom this time, but online.

 

A Napster Timeline

 

Bibliography

http://www.personal.psu.edu/users/j/i/jid102/timeline.html

http://personal.ansir.com/wms/fanning.htm

http://www.oreilly.com/catalog/peertopeer/chapter/ch01.html

http://compnetworking.about.com/library/weekly/aa062500a.htm

http://rr.sans.org/policy/post_napster.php

http://webopedia.internet.com/ComputerScience/ClientServerComputing/peertopeerarchitecture.html

www.howstuffworks.com

http://wiki.cs.uiuc.edu/cs427

http://www.cnn.com/SPECIALS/2001/napster

http://napster.findlaw.com/

http://www.idg.net/english/crd_napster_497766.html

http://www.idg.net/idgns/2001/04/02/NapsterTimeline.shtml

http://grammy.aol.com/features/0130_naptimeline.html