People are claiming that a new British law is going to allow anyone to steal your online pictures and sell them and keep the money. I think they’re mostly wrong about that law, but in the process of checking it out I ran across some bad behavior by social-media companies.
OMG they’re stealing my pretties! · Someone linked, with a gasp of horror, to UK.Gov passes Instagram Act: All your pics belong to everyone now by Andrew Orlowski. I was prepared to blow it off because Orlowski is generally wrong about everything. This is the man who, back in 2004, referred to Wikipedians as “Khmer Rouge in nappies” and has continued to get attention with lurid Internet contrarianism, which has also worked for Jaron Lanier, Andrew Keen, and lately Evgeny Morozov. The Net is important enough that it needs sensible pushback, but we can do better than these guys; I miss the days when Cliff Stoll was our best-known naysayer.
In this case it’s not just Andrew; the British Journal of Photography is less alarmist in Controversial copyright framework receives Royal Assent, but they’re still upset.
Um, maybe not · I haven’t read the British legislation, but what it apparently does is turn people loose to re-use “orphan works”: those for which the creator’s and/or rights-holders’ identities can’t be established. I think this is sensible enough; the tricky bit is in identifying the orphans.
Orlowski says “the user only needs to perform a ‘diligent search’” to establish orphan-hood. So, what’s the problem? The things I publish online, as for example here, tend to appear on pages which clearly assert they’re by Tim Bray and that certain rights are reserved. So you wouldn’t have to be very diligent at all to establish parenthood, at least in my case.
But then I read, in the BJP piece, “a large number of online services, such as Facebook, Twitter and Flickr, strip the metadata from uploaded images, creating millions of new orphan works each day” and I thought, that can’t be right. But it is, partly.
Side trip: On Exif · (Those of you who know about it can skip to the next section.)
It turns out that electronic photograph files contain not just the pixels that form the image, but also textual fields containing “metadata”, information about the picture. This is generally referred to as Exif, and it identifies some or all of: the camera, lens, date, location (if there’s a GPS), size, aperture, and lots of other arcane photographic details. Plus, crucially, the name of the creator.
Exif is super-useful but also sort of a disorganized mess; there’s poor compatibility between cameras and photo-editing packages. For example, there are at least three fields you can store authorship in: Artist, By-line and Creator (strictly speaking, these come from three different metadata standards: Exif, IPTC, and XMP respectively).
There are plenty of tools that let you peek behind the pixels at the Exif; most photo-editing packages will do this, and those of us who like the old-school command-line approach use ExifTool by Phil Harvey.
Most serious photographers arrange that when they publish an electronic photo, the Exif data includes their name. And getting back to the legal discussion, a “diligent search” to determine who owns a picture would obviously include checking out the Exif.
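For the curious, here is a minimal sketch of what that check might look like in Python. It assumes the metadata has already been dumped to text in the `Tag Name : value` layout that ExifTool prints by default; the sample dump and the function names are made up for illustration.

```python
# The three authorship fields mentioned above.
ATTRIBUTION_FIELDS = ("Artist", "By-line", "Creator")


def parse_dump(dump: str) -> dict[str, str]:
    """Parse 'Tag Name : value' lines into a dict (first occurrence wins)."""
    fields: dict[str, str] = {}
    for line in dump.splitlines():
        # Split on the first colon only, so values such as timestamps
        # that contain colons stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name : value' line
        fields.setdefault(name.strip(), value.strip())
    return fields


def find_attribution(dump: str) -> dict[str, str]:
    """Return the non-empty authorship fields found in a metadata dump."""
    fields = parse_dump(dump)
    return {name: fields[name] for name in ATTRIBUTION_FIELDS if fields.get(name)}


# Hypothetical dump, invented for this example.
SAMPLE_DUMP = """\
Make                            : Hypothetical Camera Co
Create Date                     : 2013:04:29 10:00:00
Artist                          : Tim Bray
By-line                         : Tim Bray
Creator                         : Tim Bray
"""

print(find_attribution(SAMPLE_DUMP))
```

An empty result would mean none of the three fields names an author, which is the situation a "diligent searcher" would then have to work around.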
Metadata amputation · I decided to check and see whether the BJP was right. So I took the picture above, made sure it had my name in the Artist, By-line, and Creator fields, and posted it to Twitter using the Web interface. Then I downloaded the picture and checked the Exif, and sure enough Twitter had nuked it. There were 245 lines of Exif info going in, 58 coming out, and none of them included my name.
In fact, Twitter clearly states “We remove the Exif data upon upload. It is not available to those who view your photo on Twitter.”
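The before-and-after comparison can be sketched in a few lines of Python. This assumes each file's metadata has already been read into a dictionary of field name to value by whatever tool you prefer; the function names are hypothetical and the sample values are invented, not the actual fields Twitter kept or dropped.

```python
def stripped_fields(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Names of fields present before upload but missing or empty afterwards."""
    return sorted(name for name in before if not after.get(name))


def attribution_lost(
    before: dict[str, str],
    after: dict[str, str],
    fields: tuple[str, ...] = ("Artist", "By-line", "Creator"),
) -> bool:
    """True if the original named an author and no authorship field survived."""
    had_author = any(before.get(name) for name in fields)
    has_author = any(after.get(name) for name in fields)
    return had_author and not has_author


# Invented example data: what went in, and what came back out.
uploaded = {
    "Artist": "Tim Bray",
    "By-line": "Tim Bray",
    "Creator": "Tim Bray",
    "GPS Latitude": "49 deg N",
    "Image Width": "1024",
}
downloaded = {"Image Width": "1024"}

print(stripped_fields(uploaded, downloaded))
print(attribution_lost(uploaded, downloaded))
```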
Now, I don’t think Twitter is evil. And I suspect there are some fields where this makes all sorts of sense. Lots of cameras and phones put GPS data into pictures, without telling you, and I think it’s probably sensible to keep from sharing your location with the entire world by default.
But I think removing the photo’s attribution is a serious mistake and Twitter should fix it.
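As a rough illustration of what such a fix could look like, here is a hedged Python sketch of a filter that drops location fields but keeps attribution. The rule that location fields start with `GPS` and the `keep_other` switch are assumptions made for the example; nothing here reflects how Twitter actually processes uploads.

```python
ATTRIBUTION_FIELDS = {"Artist", "By-line", "Creator"}
LOCATION_PREFIX = "GPS"  # assumption: location field names start with "GPS"


def filter_for_upload(metadata: dict[str, str], keep_other: bool = False) -> dict[str, str]:
    """Keep authorship, always drop location, optionally keep everything else."""
    kept: dict[str, str] = {}
    for name, value in metadata.items():
        if name in ATTRIBUTION_FIELDS:
            kept[name] = value  # attribution always survives
        elif name.startswith(LOCATION_PREFIX):
            continue  # never share location by default
        elif keep_other:
            kept[name] = value  # camera, lens, etc. only on request
    return kept


# Invented example metadata.
photo = {
    "Artist": "Tim Bray",
    "GPS Latitude": "49 deg N",
    "GPS Longitude": "123 deg W",
    "Lens": "Hypothetical 35mm",
}

print(filter_for_upload(photo))
print(filter_for_upload(photo, keep_other=True))
```

Dropping everything except attribution by default keeps the privacy and file-size benefits of stripping while leaving the one field a diligent searcher needs.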
Also, it’s not just Twitter; kora foto morgana pointed me at the Embedded Metadata Manifesto site, which has done the digging and published Social Media sites: photo metadata test results. If they’re accurate, they reveal that Facebook, Flickr, Photobucket, and Twitter are losing attribution. On the other hand, DropBox, Google+, Pinterest, and Tumblr are doing the right thing.
Oops, me too · I checked the pictures right here on the blog and, uh, blush... It turns out that the original image (what you see if you click on my picture) retains the attribution, but the reduced drop-shadowed version just above lost it somewhere along the pipeline through ImageMagick and then also Framer, which I wrote.
Practically speaking it’s not a problem, because anyone who wants to use one of my pictures will want to start with the larger version. But this does show that it’s easy to get this wrong. Doesn’t mean that Twitter should, because some of the competition doesn’t, and they’re supposed to be pros.
Summing up · So yeah, Orlowski was wrong as usual; “diligent search” seems to me an entirely reasonable way to determine whether any digital artifact is an orphan or not, and even with missing Exif, the amount of diligence required to figure out if a picture in a Twitter stream is an orphan or not isn’t that onerous. And unleashing the digital orphans into the public commons is a good thing.
On the other hand, certain well-known social sites are engaging in what feels to me like egregiously abusive behavior in stripping authorship information from works whose publishing they facilitate.
Please fix that up, Twitter and Flickr and Facebook. Or we might start suspecting your motives.
Comment feed for ongoing:
From: orcmid (Apr 29 2013, at 14:43)
A problem that struck me immediately in the account of the British law, which is yet to be implemented in regulations, is that anyone can intentionally orphan a work by simply removing all trace of its provenance.
So willful infringement becomes innocent use of an orphaned work. Terrific.
[link]
From: Josiah Sprague (Apr 29 2013, at 14:57)
I remember when Facebook didn't strip EXIF data from photos, and there was some drama about people using location data for bad things.
A few weeks ago, I noticed that some pictures I had uploaded to Facebook contained descriptions that came from metadata I had set via my Canon, asking people not to use photos without permission.
I think that is a good step for Facebook, but perhaps they should use all of the metadata, and let the user select which metadata is posted with the photos.
[link]
From: Chris Vannoy (Apr 29 2013, at 15:06)
Actually, I strongly suspect the reason most sites (at least large ones) strip EXIF has to do with bandwidth and file sizes.
EXIF can add a significant amount of overhead to an image. For instance, in a prior photo application (where I first ran into this), I couldn't get thumbnail (50x50) file sizes down. It was maddening.
Pages of thumbnails loaded the images very slowly because each thumb was bloated with EXIF data.
Once I stripped them out, everything was nice and speedy.
When you add that to the storage and bandwidth costs involved, it makes financial and user experience sense to strip the data out.
That said, EXIF should *always* be retained for full-res images. Those are not accessed as frequently as sized images and the bandwidth concerns are greatly lessened.
[link]
From: Heather (Apr 29 2013, at 16:00)
I'd suggest a good reason why Twitter & Flickr (for example) strip the creator tag, but Google+ doesn't, is the fact that they support fake names. People don't want to accidentally expose their real name on a site where they use a different name.
If I share a photo on Flickr, then I want it attributed to my flickr name (and it clearly is, on that site), not my real name.
[link]
From: Kevin H (Apr 29 2013, at 16:45)
As mentioned, Exif is "sort of a disorganized mess". So as a default behavior, I think it is good that online services err on the side of caution and just strip Exif completely rather than mistakenly leaking information that the person posting the image did not mean to leak.
I think an advanced user may be interested in having an option to retain (or create if it doesn't exist) certain Exif fields, but I think the average user is well served by the current default.
[link]
From: Karon (Apr 29 2013, at 17:31)
Flickr doesn't strip exif data from iPhone photos. The act of uploading from the phone does. If I save an iPhone photo to my hard drive and then upload to Flickr it retains all the data including location. Both emailing an iPhone photo to Flickr and using the Flickr app to upload from the phone do strip exif data. This points to it being a size issue around the transfer. Want to keep the data, upload from a computer. Yes, I realize this is not at all helpful where breaking news photos are concerned.
[link]
From: Tom Magliery (Apr 29 2013, at 18:22)
Flickr does not remove EXIF data from your original size photos, but it strips everything from all of the other sizes that it generates and stores.
If someone finds my photos on my Flickr pages, it's as much of a no-brainer as if someone finds your photos on your blog: attribution is evident.
The problem is what happens when someone copies my/your photos elsewhere and represents them as their own. EXIF copyright information gives you a little bit of protection, but just as a determined thief will break your car window if you lock your doors, so will a determined thief almost certainly know how to get rid of your EXIF data.
[link]
From: IBBoard (Apr 30 2013, at 01:04)
I agree that the idea is fine. Copyright is meant to be a temporary exclusive right based on the condition that society benefits from the work when it expires. If there is no owner (a company went bust and no-one bought it, the owner has died and not bequeathed it to someone, or the owner has become a recluse and given up all material concerns) then why not free it up?
I also agree that the idea is horribly flawed, though. Not that it should be much of a surprise for any modern law - politicians aren't known for doing things well in the "use vs abuse" department.
I have a Flickr account (and I've paid for Pro). I'm only an amateur, but I put all of my photos on there. When they're on the page then they're attributed to me, and I put them under a Creative Commons license. I didn't think to put any additional EXIF data in for ownership because that's a de facto right, and they're already somewhere that shows the license.
A diligent search *should* (in theory, if every single Flickr image is indexed by whatever method is considered suitably diligent) find my images. That search *should* also see that my image is the oldest, even if someone else has copied it and put it somewhere that strips metadata.
But what if I drop my Flickr Pro subscription? All of my oldest photos will disappear, anyone who has copied them before they vanish will appear to be the original source, and if they're just on some attribution-less site (like Twitter, ImageShack and the rest of them) then it could easily be classed as an orphan by anyone who wants to.
Or what if Flickr isn't automatically indexed? I'd read that there were specific locations being mentioned as registries of works. How many people regularly add their works to such registries?
And finally, what about the Internet? This is obviously where it is going to be hit the most. We're now going to be claiming fair game on any "orphaned" work from any country! That seems like we're stretching our powers a little.
Given the flagrant disregard for copyright that a lot of people show ("I just want an image for X and it was in Google image search, so it is fine to use") then it isn't a big stretch for images to be found in all sorts of places and then taken from there.
Now, how do I go back through my Flickr photos and add copyright and other related EXIF data?
[link]
From: Steve Walker (Apr 30 2013, at 01:10)
Tim, you suggest that Flickr strips out EXIF data, and, at least for me, it doesn't. Under "Actions" for my photos, there's an option to view EXIF data, and the comment I put in EXIF with my name and phone number remains there. So Flickr is honourable in this respect.
[link]
From: Seb (Apr 30 2013, at 07:22)
But surely there is also a problem. Suppose someone downloads all of the pictures from your website and stores them in a folder with pictures from thousands of other websites, using random names for the files. Later they find one of the images that has been stripped of Exif (the drop-shadowed photo above, for example) and want to use it. How would they perform a 'diligent' search?
Replace local folder with online photo repository and you have a recipe for madness!
[link]
From: -b- (Apr 30 2013, at 07:58)
Of that mess of metadata, IPTC is where the copyright is supposed to be.
[link]
From: Richard Mobey (Apr 30 2013, at 09:26)
"Since March of 2010, I’ve been an employee of Google."
That would be the Google that benefits hugely from free content would it ?
[link]
From: Greggman (Apr 30 2013, at 09:36)
What defines "diligent search"? Can I take an image of Luke Skywalker fighting Darth Vader in Return of the Jedi and do a search "2 guys in black clothes with glowing swords". Such a search on Google Image Search doesn't find any pictures from Star Wars at least when I did it. Was my search diligent?
Google has image search by image. I've often searched for images of movies I'd like to know the name of and not had a match. Can I now use the image?
The problem with the new law is that it removes the incentive to be lawful. If the work really is orphaned then there's no problem because no one will sue. If the work is not orphaned but you just couldn't figure that out (or claim you couldn't figure that out) you can now get away scot-free. Just use any pictures to your heart's content; if a complaint comes, claim you did a diligent search and face no repercussions. That sounds like a law designed to exploit people, not a law designed to allow use of orphaned works (no such law is needed, by definition).
[link]
From: Fiona Campbell (Apr 30 2013, at 09:38)
'Stealing my pretties' read 'stealing my livelihood' (I'm a photographer). This bill (perhaps it might be an idea if you read it) will only result in good images being removed from the internet.
Check
http://www.stop43.org.uk/pages/news_and_resources_files/photographers_have_just_been_royally.php
[link]
From: David Riecks (Apr 30 2013, at 10:25)
Tom wrote: "Flickr does not remove EXIF data from your original size photos, but it strips everything from all of the other sizes that it generates and stores." Several others note that they can see the Exif information.
However if you download the original image from a flickr page, you'll find that this is not the case -- unless you have PAID for a PRO flickr account. The free accounts strip the information from all versions, regardless. Even the Pro accounts do not retain the Exif (or other embedded metadata) in any of the derivatives created. Perhaps a reason to pay for a pro account, but if so, I'd want the embedded info to stick to all the derivatives as well as the original.
Tests on more services, as well as details, can be seen by clicking on the "preliminary results" link on the Social Media test page (http://www.controlledvocabulary.com/socialmedia) -- this is the source data on which the IPTC based their survey results. Check to make sure you have the most recent result, as several services have changed how they deal with metadata since we began gathering reports in late 2009.
As Tom points out, the bigger issue occurs when a person downloads a file from a social media network which does not preserve metadata. When this happens the resulting file is, by definition, an orphan work (unless the person downloading the image has the foresight to keep records on where they got the file).
[link]
From: Kevin Marks (Apr 30 2013, at 10:41)
You work for a diligent search company. How about if I drag a photo into Google search to find similar ones: does it do its best to show provenance and EXIF data from the copies it finds? Currently it only shows context from the web pages it's embedded in.
[link]
From: Jonathan Webb (Apr 30 2013, at 14:07)
The line from the UK government is that the "diligent search" will stop businesses helping themselves to everybody's photos. Unfortunately what they have in mind for a diligent search is not a search like you or I would do; at best it's simply a bunch of librarians asking another bunch of librarians if they know who this picture belongs to. There is no attempt to search using image recognition technology like Google image search. At worst they might not even search at all but simply post in an obscure journal or on a website that they have found an image and ask the owner to come forward.
The EU has published Diligent Search Criteria here: http://ec.europa.eu/information_society/activities/digital_libraries/doc/hleg/orphan/guidelines.pdf
The act is clearly designed to help big publishers get their hands on free content. Under the new act, if somebody is involved in a tragedy or drama, news media will be able to help themselves to all the Facebook images of the victims / participants. That's big money and apparently more important than the rights of the photographers or the people in the photographs.
[link]
From: Joseph Ford (May 01 2013, at 04:50)
The point that is being (deliberately) missed by the politicians is that metadata can be stripped not only by Flickr et al, but also by anyone who happens to find a picture they want to use for commercial gain.
The process goes like this:
X finds a picture
X strips the metadata
X can now say that there's no way of finding who created the image
X now uses image legally as an orphan work.
Then add the following permutation:
Y comes across the picture X has used. By removing the metadata, X made this picture into an 'orphan work'. Y now genuinely has no way of finding out who created the image.
Y now uses image legally.
etc etc ad infinitum.
[link]
From: Dan Guy (May 01 2013, at 09:13)
I am all for getting orphan works back into circulation. I am surprised that it should start with photographs and not, say, out of copyright books.
I understand the photographers' concern. It seems to me that the better solution is to attack (and upgrade) what constitutes a diligent search, rather than the concept of allowing for the use of orphan works.
Elsewhere I read a suggestion that "diligent search" should include an online image search like that provided by Google Images. I'm not sure how you mandate it without tying it to a specific company, but that seems sensible. Then, photographers merely have to post their images with attribution somewhere crawlable online, be it flickr or their own sites.
[link]
From: len (May 01 2013, at 09:56)
"Orlowski was wrong as usual... - tbray"
"hiding information from the web is suspicious - tbray":
So
http://money.cnn.com/gallery/technology/security/2013/05/01/shodan-most-dangerous-internet-searches/index.html
and there is that nifty keen Excel feature that is on by default and searches Bing when one clicks on a cell, so if one is using Excel on a project, one might as well be broadcasting.
Stupid is as stupid does. The web was fielded witlessly. Happy anniversary.
[link]
From: alan herrell (May 06 2013, at 10:10)
Every one of the social sites strips Exif data. Part of this behavior is explained by file size reduction, which you grant by using these sites.
The kicker comes with the grants that every one of these sites has in its TOS.
Facebook does it.
“2.3 For content that is covered by intellectual property rights (like photos and videos), you specifically give us the following permission, subject to your privacy and application settings: you grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use, copy, publicly perform or display, distribute, modify, translate, and create derivative works of (“use”) any content you post on or in connection with Facebook. This license ends when you delete your content or your account.”
Instagram does it:
“you hereby grant to Instagram a non-exclusive, fully paid and royalty-free, worldwide, limited license to use, modify, delete from, add to, publicly perform, publicly display, reproduce and translate such Content, including without limitation distributing part or all of the Site in any media formats through any media channels, except Content not shared publicly (“private”) will not be distributed outside the Instagram Services.”
So as soon as they are stripped of Exif data and renamed, there is a strong and compelling argument that they become not only derivative works, which could transfer copyright to the sites for the newly created image, but also orphan works, which makes them ripe for cherry picking.
Whether this comes to pass is up in the air, but stranger stuff has happened.
[link]
From: Sigfrid Lundberg (Jun 03 2013, at 04:26)
I downloaded one of my images from Flickr in "original resolution" to check what it contained. Everything was there, like focal length, camera etc. Artist and Creator weren't, but I suppose that is my own fault (I could have added them). There was even a digiKam revision history section in XML.
The Gimp, which I use occasionally, strips just about everything. Even creation dates. Which is just as bad for retrieval purposes. There's a lot to do in this area. ImageMagick actually dies if a tiff contains a colour profile!
Sigfrid
[link]