I propose that the World Wide Web would serve well as a framework for structuring much of the academic Computer Science curriculum. A study of the theory and practice of the Web’s technologies would traverse many key areas of our discipline. Furthermore, there is a natural way to structure such a traversal to support a course of study stretching over many semesters.
(One side-effect of such an approach is that graduates would enter the workforce with significant exposure to a suite of technologies that would likely be valuable in their professional careers. But that’s not the argument I’m making here.)
In the following layout, the course numbering implies a first guess at levels and prerequisites. Since I’ve never designed a curriculum before and my own undergraduate study is decades in the past, it’s quite likely that my suggestions as to the number of courses and semester levels need improvement.
Web 101: Architectural Introduction · Core idea: Teach what happens when you click on a link and a page with some multimedia and personalization is rendered for you.
To be covered: What a browser and a server are. What HTML, CSS, HTTP, and URIs are. What CGI and application servers are. An initial look at the underlying architecture.
Sample exercises: Deploy a web server. Write some simple HTML/CSS pages with a text editor. Write CGI scripts in a high-level language to produce personalized pages.
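To make that last exercise concrete, here's a minimal CGI sketch in Python; the "name" parameter and the greeting are invented for illustration, and a real assignment would add form handling and error checking.

#!/usr/bin/env python3
# Minimal CGI script: read a "name" query parameter and emit a personalized page.
import os
from html import escape
from urllib.parse import parse_qs

query = parse_qs(os.environ.get("QUERY_STRING", ""))
name = query.get("name", ["stranger"])[0]

# A CGI program writes HTTP headers, a blank line, then the response body to stdout.
print("Content-Type: text/html; charset=utf-8")
print()
print("<html><head><title>Hello</title></head>")
print("<body><h1>Hello, %s!</h1></body></html>" % escape(name))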
Web 210/310/410: Data Structures and Algorithms · Core idea: The classic center of the CS curriculum.
To be covered: All the usual stuff. The Web dimension comes from the use of a Document Object Model implementation as a teaching aid.
Sample exercises: Incrementally build more and more of a DOM implementation, probably in a low-level language like C or Java. A wide variety of important data structures and algorithms can be put to good use in such an implementation.
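The exercise suggests C or Java; purely as a sketch of the core data structure, here's a toy DOM-like node in Python, with simplified stand-ins for the real DOM method names.

# Sketch of the heart of a DOM-like tree: each element node keeps an ordered
# list of children and a back-pointer to its parent.
class Node:
    def __init__(self, tag, text=None):
        self.tag = tag          # element name, e.g. "p"
        self.text = text        # text content, if any
        self.parent = None
        self.children = []

    def append_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def get_elements_by_tag_name(self, tag):
        # Depth-first traversal; a real DOM would also index by id, class, etc.
        found = []
        for child in self.children:
            if child.tag == tag:
                found.append(child)
            found.extend(child.get_elements_by_tag_name(tag))
        return found

# Usage: build <body><p>hi</p><p>there</p></body> and query it.
body = Node("body")
body.append_child(Node("p", "hi"))
body.append_child(Node("p", "there"))
print([n.text for n in body.get_elements_by_tag_name("p")])  # ['hi', 'there']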
Web 220: Principles of Networking · Core idea: Classic survey of networking basics.
To be covered: This is a straightforward computer-networking course, making a conventional traversal of the stack.
Sample exercises: Implementation of a small lightweight Web server that accepts connections and serves requests for files from the filesystem.
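A minimal sketch of such a server in Python, assuming a ./htdocs document root: one request per connection, GET only, no security or MIME handling. The point is to make the request/response cycle visible, not to be deployable.

# Minimal file-serving HTTP server over a raw socket.
import socket, os

HOST, PORT, DOCROOT = "", 8080, "./htdocs"   # assumed layout for the exercise

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind((HOST, PORT))
srv.listen(5)

while True:
    conn, addr = srv.accept()
    request = conn.recv(4096).decode("iso-8859-1")
    path = request.split(" ")[1] if " " in request else "/"
    if path.endswith("/"):
        path += "index.html"
    filename = os.path.join(DOCROOT, path.lstrip("/"))
    try:
        with open(filename, "rb") as f:
            body = f.read()
        header = "HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
    except OSError:
        body = b"not found"
        header = "HTTP/1.0 404 Not Found\r\nContent-Length: %d\r\n\r\n" % len(body)
    conn.sendall(header.encode("iso-8859-1") + body)
    conn.close()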
Web 230: User Interfaces · Core idea: Survey UI issues, assuming the browser as the delivery platform, to put the focus on the content, not the technology.
To be covered: Principles of usability, user experience design, brief introduction to typography and design issues, evaluation techniques. On the implementation side, MVC.
Sample exercises: Build (in groups) a Web user interface to a score-keeping system for some sport, evaluate its usability, improve it.
Web 320: Application Protocols · Core idea: Distributed applications are based on various flavors of message-exchange pattern. This is a survey of the options.
To be covered: Streaming protocols, session-oriented protocols, HTTP-style “message exchange” protocols.
Sample exercises: Implement selected portions of WebDAV, XMPP, and AtomPub.
Web 340/440: Persistence · Core idea: Study the issues surrounding persistent storage of data.
To be covered: Filesystems. Byte- and record-oriented storage. Relational databases. Document databases and distributed hash tables.
Sample exercises: Build some of the pieces of an application like Twitter or Facebook, implementing efficient storage and retrieval of different kinds of information resources.
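One small sketch of such a piece, using Python's built-in sqlite3 module: store short status updates and fetch a user's most recent ones through an index. The schema and function names are invented for illustration.

# Store status updates and retrieve a user's most recent ones efficiently.
import sqlite3, time

db = sqlite3.connect("statuses.db")   # local file, for the sketch
db.execute("""CREATE TABLE IF NOT EXISTS statuses
              (id INTEGER PRIMARY KEY, author TEXT, posted REAL, body TEXT)""")
db.execute("CREATE INDEX IF NOT EXISTS by_author ON statuses (author, posted)")

def post(author, body):
    db.execute("INSERT INTO statuses (author, posted, body) VALUES (?, ?, ?)",
               (author, time.time(), body))
    db.commit()

def recent(author, limit=20):
    # The (author, posted) index makes this a range scan, not a table scan.
    cur = db.execute("SELECT body FROM statuses WHERE author = ? "
                     "ORDER BY posted DESC LIMIT ?", (author, limit))
    return [row[0] for row in cur]

post("tim", "Sketching a Web Curriculum")
print(recent("tim"))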
Web 350: Parsing · Core idea: Study methods for extracting structure from textual inputs.
To be covered: The usual, including grammars, automata, and other parsing strategies. Text representation, especially Unicode.
Sample exercises: Build parsers for HTTP headers, then for JSON, then for XML, then for ECMAScript.
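For the first of those targets, here's a sketch of an HTTP request-head parser in Python. It ignores obsolete line folding and repeated header fields, which a full assignment would have to handle.

# Split a raw HTTP request head into the request line and a dict of header fields.
def parse_request_head(raw):
    lines = raw.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:            # blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

sample = ("GET /index.html HTTP/1.1\r\n"
          "Host: example.com\r\n"
          "Accept: text/html\r\n"
          "\r\n")
print(parse_request_head(sample))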
Web 351: Data Format Design · Core idea: Study the techniques and trade-offs that arise in designing data formats.
To be covered: Media data formats (JPEG, PNG, video), binary data formats (ASN.1, protocol buffers), textual data formats (JSON, XML). Compression.
Sample exercises: Implement a PNG reader, design (in a group) a non-trivial XML language.
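A sketch of the first steps of the PNG exercise in Python: check the signature and walk the chunk list, verifying each CRC. Decoding the IDAT pixel data (inflate plus scanline filters) is the bulk of the assignment; the file name example.png is just a placeholder.

# Walk a PNG file's chunk list: length, type, data, CRC.
import struct, zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_chunks(path):
    with open(path, "rb") as f:
        if f.read(8) != PNG_SIGNATURE:
            raise ValueError("not a PNG file")
        while True:
            length_bytes = f.read(4)
            if len(length_bytes) < 4:
                break
            (length,) = struct.unpack(">I", length_bytes)
            ctype = f.read(4).decode("ascii")
            data = f.read(length)
            (crc,) = struct.unpack(">I", f.read(4))
            # The PNG CRC covers the chunk type plus the chunk data.
            if zlib.crc32(ctype.encode("ascii") + data) != crc:
                raise ValueError("bad CRC in %s chunk" % ctype)
            yield ctype, data
            if ctype == "IEND":
                break

for ctype, data in read_chunks("example.png"):   # placeholder file name
    print(ctype, len(data))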
Web 360: Security · Core idea: Study techniques for implementing computer programs assuming the presence of large numbers of technically-skilled malicious users.
To be covered: Introduction to digital signature and encryption technologies. Threat analysis. Social engineering. Viruses, worms, botnets, and other villains.
Sample exercises: Implement a brute-force attack on an obsolete encryption technology. Write a recommendation on how some well-known successful criminal exploits could have been averted.
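A toy sketch of the shape of such an attack in Python: brute-force a single-byte XOR "cipher" by scoring candidate plaintexts. A real assignment would target an obsolete but genuine algorithm such as DES; the exhaustive-search structure is the same.

# Exhaustive key search over a 256-key "cipher", scoring candidates for English-ness.
def xor_bytes(data, key):
    return bytes(b ^ key for b in data)

def english_score(data):
    # Crude score: occurrences of common English letters and the space character.
    return sum(data.count(c) for c in b"etaoin shrdlu")

ciphertext = xor_bytes(b"attack at dawn", 0x5a)   # pretend only this is known

best_key = max(range(256), key=lambda k: english_score(xor_bytes(ciphertext, k)))
print(hex(best_key), xor_bytes(ciphertext, best_key))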
Web 470: Issues in Large Distributed Systems · Core idea: Study problems that arise in systems much too large for a single computer.
To be covered: REST. Naming objects (DNS, URIs, other naming schemes). Caching. Partitioning workloads. Web crawling. Functional programming. Error handling.
Sample exercises: Implement something that uses the DNS. Learn the ideas behind Erlang and Map/Reduce. Write a small-scale Web crawler.
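As a sketch of the Map/Reduce idea, here's a tiny in-process word count in Python: a map step emits (key, value) pairs, a shuffle groups them by key, and a reduce step folds each group. The distributed machinery that makes this interesting at scale is what the course would actually study.

# In-process word count in the Map/Reduce shape.
from collections import defaultdict

def map_phase(document):
    for word in document.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    return word, sum(counts)

documents = ["the web is a system", "the web of documents"]

# Shuffle: group intermediate values by key.
groups = defaultdict(list)
for doc in documents:
    for key, value in map_phase(doc):
        groups[key].append(value)

results = dict(reduce_phase(w, c) for w, c in groups.items())
print(results)   # {'the': 2, 'web': 2, ...}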
Web 480: Concurrent Systems · Core idea: Study the options for making use of modern highly-concurrent computer platforms.
To be covered: Threads, processes, actors, message-passing.
Sample exercises: Implement message-passing infrastructure in a functionally-flavored language (Scala, Erlang) and in a conventional procedural one (Java, Ruby).
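The exercise asks for Scala/Erlang and Java/Ruby implementations; purely as a sketch of the pattern itself, here's a thread-plus-mailbox "actor" in Python, using a None message as the shutdown signal.

# Each "actor" is a thread that owns a mailbox queue and is driven only by messages.
import threading, queue

class Actor(threading.Thread):
    def __init__(self, handler):
        super().__init__(daemon=True)
        self.mailbox = queue.Queue()
        self.handler = handler

    def send(self, message):
        self.mailbox.put(message)

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is None:          # poison pill shuts the actor down
                break
            self.handler(message)

echo = Actor(lambda msg: print("got:", msg))
echo.start()
echo.send("hello")
echo.send("world")
echo.send(None)
echo.join()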
What’s Missing? · The Web Curriculum doesn’t cover all of CS. Here is a (doubtless non-exhaustive) list, in no particular order, of important subjects where a Web-centric academic approach would not obviously be of help:
Computer graphics.
Object-orientation.
Real-time software.
Number representation and arithmetic.
Artificial intelligence.
Computer system architecture.
Analysis of algorithms.
Comments:
From: Jesse Robbins (Jul 15 2009, at 11:16)
Tim,
A few months ago I explored some of this on the O'Reilly Radar. I think what matters is that an engineer understands the Web as a system, and can see the whole while examining the parts. I think your curriculum should start there.
In my post I said that my favorite interview question to ask candidates is: "What happens when you type www.(amazon|google|yahoo).com in your browser and press return?"
While the actual process of serving and rendering a page takes seconds to complete, describing it in real detail can take an hour. A good answer spans every part of the Internet from the client browser & operating system, DNS, through the network, to load balancers, servers, services, storage, down to the operating system & hardware, and all the way back again to the browser. It requires an understanding of TCP/IP, HTTP, & SSL deep enough to describe how connections are managed, how load-balancers work, and how certificates are exchanged and validated... and that's just the first request!
While people often specialize in particular components, great engineers always think of that component in relation to the whole. The best engineers are able to fly up to the 50,000-foot view and see the entire system in motion, and then zoom in to microscopic levels and examine the tiny movements of an individual part.
(link: http://radar.oreilly.com/2009/05/velocity-conference-big-ideas.html)
-Jesse Robbins
From: Michael Hanson (Jul 15 2009, at 11:30)
Many computer science curricula have added software engineering courses to the required undergraduate program.
Without getting engaged in too much of the CS vs. SE debate, I think most observers would agree that some exposure to the basic ideas of software engineering should be part of any practical CS degree program.
Suggested topics: unit, functional, and integration testing; principles of source code management; API design and documentation; code maintainability; bug analysis and triage; release management. There should be a quarter- or semester-long group project that requires multiple releases and issue tracking.
A moderately complex web application, with some functional layers to it, would be a good fit with the "Web Curriculum" concept.
From: JulesLt (Jul 15 2009, at 11:30)
Have you seen Opera's Web Curriculum? It's a stab in the same direction.
http://www.opera.com/company/education/curriculum/
From: Jamie Macey (Jul 15 2009, at 11:30)
I think algorithms should be easy enough to cover. There are lots of opportunities to have students build a small app with a bit of backend processing, and cover efficiency, big-O notation, etc.
From: PJ (Jul 15 2009, at 11:39)
Where do compilers fit in?
Object Orientation could be viewed as a subset of data structures (code is data, right?)
Real-time software could somewhat be covered via web server design: require a max of XXms latency under all circumstances. Or maybe as an offshoot of Data Format Design, to figure out how to handle or stream media data formats.
Numeric representation and arithmetic could be part of data structures.
AI could start with expert systems like recommendation engines and work through neural nets, Bayesian filters, and such.
Architecture could fit in with the Real-Time bits above - you'll need to understand them all to figure that one out fully.
Analysis of Algorithms could be part of Data Format Design, maybe the 200-level version.
From: Sérgio Nunes (Jul 15 2009, at 12:17)
Any recommendation on a book (or books) covering all these topics? It is hard to find good books in this (fast-moving) area.
From: Mark (Jul 15 2009, at 13:01)
http://www.moserware.com/2009/06/first-few-milliseconds-of-https.html rocked my world.
From: Justin Watt (Jul 15 2009, at 13:30)
Looking forward to one day sending the kids off to the Tim Bray School of Internet Science...
From: Tony Fisk (Jul 15 2009, at 17:06)
Database management is missing from the non-web list.
Other practical issues:
- training language? (or are we concentrating on raw HTML?)
- asynchronous communications (i.e. AJAX)
- frameworks
- agnostic comparisons of said languages/frameworks
- Web 2 and all that, plus what might constitute Web 3?
- OpenFlow would be a good one to include as well, though it's a bit low-level, as it deals with efficient routing management.
As the question about what happens when you click a URL demonstrates, the web is nearly as deep as it is broad. I can envisage streams (e.g. social as well as technical aspects).
From: Anton (Jul 15 2009, at 19:44)
A few other thoughts: Web site/page performance. Steve Souders' High Performance Web Sites is a great reference. Web standards and cross-browser development, including designing for mobility. Working in teams/agile development.
From: JulesLt (Jul 15 2009, at 23:07)
Coming back to this - I really liked the structure of my own course, which went right down as far as 'how to make a logic gate out of transistors' and built up from that to designing a register, and a basic micro-code controller - in retrospect, a basic Turing machine.
This was at the same time as doing 'fun' stuff in high-level languages, rather than having to build a computer first, but by the end of the 3 years I had a sketchy knowledge of how a computer worked, from the electronics to microcode to assembler to compiler (and operating system) and all the way up.
Luckily for me, there wasn't a whole lot more 'up' from there. We were still in the era when GUI and OO were novel enough to be optional 3rd year modules - rather than widespread.
But I think I came out with a good basic set of skills I was able to build on, and most of which have come in handy since, even if the lowest level I've worked at since has been the occasional bit of C.
In contrast, I found some of my graduate peers who'd been on courses oriented very much towards commercial development lacked that understanding. It's the old saw about abstraction: time-saving right up until the abstraction gets in the way of what is really happening, and if you never understood what is really happening...
I do like the idea of using the Web as a model for constructing a course though (my course was clearly structured around the model of a multi-user mainframe system, for instance).
Things I'd add in :
Interpreter/runtime design and theory (driven from JavaScript, or a subset of it?).
Comparative programming languages - ensuring developers have a grasp of functional programming, prototype- vs. class-style OO, explicit vs. implicit type systems, etc., and the trade-offs involved (with an emphasis on getting young minds to understand there is always a trade-off, rather than a right/wrong decision).
Messaging/queuing systems - a bit of future-proofing; the most likely change in the web model will be a shift from polling to 'real' push technology.
From: Paul Morriss (Jul 16 2009, at 01:17)
I like your basic premise. My most satisfying Maths course at University was the one where they showed why you couldn't square the circle. Relating something deep to something familiar provides a lot of interest.
From: Oliver Mason (Jul 16 2009, at 02:58)
AI could be covered with a discussion of 'chatterbots'; that would also touch on Natural Language Processing.
From: Tkil (Jul 16 2009, at 04:07)
An earlier comment touched on this, but I'd like to expand on an idea: your posted curriculum is a great plan of how to teach people so that they know how to implement things ... once they know what they are implementing.
My university career was probably a bit more eclectic than was typical for the early '90s, but it only gave me the tools of the trade. Learning to what end to use those tools, however, was a different story.
The best education I ever had was sitting in a cubicle with the users I was supporting, and simply watching them work for a day or five. Without that knowledge of what they needed (or wanted), it doesn't matter how well I can implement systems.
It doesn't always flow in this direction, of course; there will always be brilliant and lucky visionaries who see what people want before those people themselves know -- but 99% of us are simple implementers, and we'll be much more successful if we learn how to distill and fulfill user requirements.
Would you have been able to inform and shape the XML movement without your background in real world applications?
From: Stephen Leather (Jul 16 2009, at 06:28)
Tim,
Thank you for the bird's-eye view. I agree with Jesse about the systemic view of the Web. Nuclear power technicians have to answer similar questions throughout their careers. I don't work in the field anymore, but I always loved working on that (still do, especially with a whiteboard).
I would also take Michael's suggestions further in two directions. First, students should be required to use (and be consistently graded on) good practices *from the beginning of their academic careers*, such as source code control, etc. This helps them develop good habits. Second, also *from the beginning*, students should be working on large projects as part of a team of students from all levels.
From: RE (Jul 16 2009, at 08:37)
Um, should you have some maths in there?
Another thing I'm really curious about: what is your motive for coming up with this, apparently out of the blue?
To me, your proposal feels far more like a web-orientated SE programme than a CS one -- not necessarily a bad thing, of course.
From: Pete Berry (Jul 16 2009, at 11:58)
An interesting idea, Tim - and one that could lead to an interesting course.
The big thing I missed, though, is people: the large space that includes user interaction; requirements capture; what used to be called Systems Analysis; design in all its forms; team working.
I'd say that even the most techie-centric, propeller-heads-only course would benefit from including some or all of these.
Topics in this area can be challenging, interesting, and rewarding. Jakob Nielsen's work is one exemplar, and incidentally the Open University here in the UK has a cut-down version of the kind of thing you may have in mind in its Certificate in Web Applications Development - see http://www3.open.ac.uk/courses/bin/p12.dll?Q01C39
From: Mark Nottingham (Jul 16 2009, at 18:28)
I think the key here is de-emphasising programming as a topic on its own. Programming is like writing or drawing; it's a necessary skill that does need development, but isn't an end unto itself.
I've met so many engineers in my career who can talk until they're blue in the face about object orientation or other programming skills, but as soon as you get them out of the box of a single address space, they're totally flummoxed.
This is why I often find that people who come from an ops/sysadmin background have a much more intuitive understanding of how networked systems work, the challenges of management and deployment, etc.
From: SeanHn (Jul 17 2009, at 05:27)
Really interesting idea, but if you're going to call it Computer *Science* then there is going to have to be at least one mandatory mathematics course a year (and preferably one a semester), covering statistics, logic, advanced calculus, set theory, etc.
The content and basic skills taught in those courses are mandatory for anyone who is going to call themselves a scientist, IMO. It might have just been my undergrad, but generally people who sucked at maths sucked at the practical side as well, with only one or two exceptions.
From: Nicola Larosa (Jul 17 2009, at 05:40)
> Other practical issues:
>
> - training language?
Why, Python obviously. ;-)
From: Noah Mendelsohn (Jul 17 2009, at 14:58)
I think your "what's missing" list is in the ballpark, but if appropriately expanded with detail, that "missing" list covers a healthy chunk of CS, including all of hardware and systems design from the transistor level on up: memory hierarchies, instruction set design, paging, I/O subsystems, etc.
Probably also missing is a thorough look at programming language design and tradeoffs, and at operating system structures (so far covered only obliquely, in Concurrent Systems).
Furthermore, while not exactly missing, the focus on the Web tends to de-emphasize some issues that have traditionally been of great importance, including transaction processing techniques (two-phase commit, logging, checkpointing, etc.). While these aren't sexy on the Web, they still tend to be back there when we use the Web to move money in our bank accounts or book a plane ticket. Likewise, data management tends to involve a lot of issues at the "back end" that are only indirectly visible when viewed through the prism of the Web.
Similarly, I think the Web-based approach could lead to an imbalanced view of things like protocols. Presumably TCP/IP and HTTP would get a lot of emphasis, but what about the 25 years of work on RPC and its optimization, including at a protocol level? What about deep dives into the tradeoffs of distributed object systems and their associated protocols? What about looking at Linda/TSpaces/Jini? Will those things come up with appropriate emphasis in this curriculum?
I think this is an interesting exercise, but it seems to introduce some bias into the curriculum that is, to my eye, ultimately unfortunate. I do think it's an interesting motivational framework for the bits that it does hit. Ultimately, I think a good CS curriculum should try to look at things from multiple angles, not just one. If the end result is to convince students that the Web is ultimately a good architecture, so be it, but they should get there by surveying a broad range of other work and analysing the tradeoffs.
From: Paul W. Homer (Jul 20 2009, at 14:47)
I tend to see things in terms of technologies.
In that sense, the web is just the intersection of computer science with several different types of technology. Interfaces, persistence, protocols and distribution are probably the main categories. It's a top-down perspective starting with the user's direct interaction and gradually getting deeper.
Most new technologies are really just reinventions of older ones, so it's far better to understand the patterns, which stay the same, than the specifics, which often change.
Architecture and data-structures are two keys to building new technologies. Once you know where the existing pieces fit, building new ones is the next challenge.
Paul.