Tag Archives: collections

Content sharing and ambient display with Electric Objects EO1

Scenic panel El Dorado, designed by Joseph Fuchs, Eugène Ehrmann and Georges Zipélius and manufactured by Zuber & Cie, 1915–25, Gift of Dr. and Mrs. William Collis. From the Cooper Hewitt collection, displayed on an EO1. Photo by Zoe Salditch

One of the cornerstones of Cooper Hewitt’s very visible digital strategy has been promiscuity. From the first steps in early 2012 when the online collection was released, we’ve partnered widely – from Google Art Project and Artsy to Artstor and now Electric Objects.

Electric Objects is a little different from the others in that we’ve worked with them to share a very select and small number of collection objects, much in the way that Pam Horn and Chad Phillips have worked to grow the museum’s ‘licensed product’ lines of merchandise.

Electric Objects is a New York startup that raised a significant amount of money on Kickstarter to build and ship a ‘system for displaying digital art’. Jake Levine, Zoe Salditch and their team have now developed the EO1 into a small ecosystem of screens deployed in the homes and offices of about 2500 ‘early adopters’ and digital artists who have been creating bespoke commissions for the system.

Cooper Hewitt joined the New York Public Library in providing a selection of collection materials to see what this community might make of them – and, internally, to think about what it might mean for digital art to become ‘ambient’ in people’s homes.

I spoke to Jake and Zoe late last week in their office in New York.

Seb Chan – I like how the EO1 has ‘considered limitations’ – the lack of a slideshow mode, the lack of a landscape mode – can you tell us a bit more about what went into these decisions? And now that EO1s are in homes and offices around the world, what has the response been like?

Jake Levine – Computing has for the last 50 to 60 years been characterized by interaction, generally for the sake of productivity or entertainment. Largely as a result, we’ve built software whose basis for success is defined by volume of interaction. Most companies start with: ‘how often can we get users to engage with our product?’

What we’ve been left with is a world filled with software competing for our attention, demanding our interaction. And we feel like crap. We feel overwhelmed.

EO1 was an experiment in a kind of computing that, by definition, could not demand anything from us. We asked whether we could build a computer that brought value into its environment without asking for user interaction. How do we ensure that the experiment remains valid? We make interaction impossible. You can’t ‘use’ EO1, just like you can’t ‘use’ art.

In the interest of exploring a different kind of computing, we made sure not to take any existing software paradigms for granted. The slideshow, of course, is ubiquitous in digital photo frames, to which we are often compared. For that decision, we went back to first principles — why? Why do we want slideshows? My experience with slideshows is characterized by distraction. The image changes, it catches my eye, it interrupts my conversation. Change demands our attention.

We say we want slideshows, but how much of that has to do with expectations informed by how screens have behaved in the past, without enough time spent thinking about how they might behave in the future? We’re so accustomed to the speed of the web, that even while we complain about it, when we’re presented with an alternative, we decide that we miss it.

But what is the value of change on the Internet? For me it’s not about randomness, it’s not about timers and playlists and settings. Change at its most meaningful happens in social contexts, in software that lives on top of a network, where ephemerality is actually just conversation, people talking. Twitter, Facebook, Instagram, Tumblr — these services aren’t an overwhelming flood of information, they are people talking to each other, and that’s why we keep coming back.

So you will likely see change enter the Electric Objects experience in the future, but it won’t be programmatic. It will be social.

Electric Objects, like all networked media discovery software, is a shared experience. And that’s also why we lack landscape. It’s important that everyone experiences Electric Objects in the same way, to create a deeper connection among its members. It also makes for a better user experience.

SC – Defaults matter, I think we all learned that from Flickr, and I really like that EO1 is ‘by default’ Public. This obviously limits the use of the EO1 as a digital photo frame, so what sort of things are you seeing as ‘popular’?

JL – People love water! So many subtly moving water images! But beyond the collective fascination with water, a lot of people are displaying the artwork we’re producing for Art Club, our growing collection of new and original art made for EO1 (including the awesome collection of wallpaper from Cooper Hewitt!).

Sidewall, wallpaper with stylized trees, ca. 1920, designed by René Crevel, manufactured by C. H. H. Geffroy and distributed by Nancy McClelland, Inc. Gift of Nancy McClelland. From the Cooper Hewitt collection, displayed on an EO1. Photo by Zoe Salditch.

SC – Cooper Hewitt joined the Art Club early on and we’re excited to see a selection of our historic wallpapers available on the device. This wasn’t as straightforward as any of us had expected, though. Can you tell us about the process of getting our ‘digitised wallpapers’ ready and prepared for the EO1?

JL – When you’re bringing any art onto a screen, you have to deal with a fixed aspect ratio. Software designers and engineers know the pain of accommodating varying screen sizes all too well. In many ways what we offer artists — a single aspect ratio across all of our users — is a welcome relief. What’s more challenging is “porting” existing work into the new dimensions.

Wallpapers were actually a great starting point, because they’re designed to be tiled. Still, we hand cropped and tiled each object, to ensure an optimal experience for the user (and the art!).

SC – Our friends at Ghostly and NYPL took a slightly different route. Can you tell us about how both of those collaborators chose and supplied the works that they have made available?

JL – Ghostly is a label that represents a fantastic group of artists and musicians. Together, we selected a few artists to participate in the Ghostly x EO collection, featuring original work made specifically for Electric Objects.

And NYPL was somewhere between Ghostly and what we did with Cooper Hewitt. NYPL has this incredible collection of maps that they’ve digitized. We knew we didn’t want to simply show a cropped version of the maps on EO1, so we turned to the artist community and started taking proposals. We asked: what would you do with these beautiful maps as source material?

Natural Elements by Jenny Odell, from the NYPL x EO Collection

Jenny Odell produced an incredible series of collages. She spent ninety-two hours in Photoshop cutting out the illustrations that cartographers often include on the edges of maps – beautiful illustrations that rarely get any attention, since the maps have a primarily functional purpose. In this case we used something old to make something new, something designed with and for the screen. It was perfect.

SC – Art Club feels like it could be sort of a ‘Bandcamp for net art’. I know you’ve been commissioning specific works for the EO1 and making sure artists get paid, so tell us more about how you see this working in the future.

Zoe Salditch – Without art, EO1 would just be any other screen. And we’ve known since the early days that art made for EO1 is always a better experience.

There are many ways people engage with and have historically paid for art, so we’re exploring a couple of different ideas. Right now, we commission artists upfront and ask them to create small series for EO1, and this collection is available free to EO1 owners for now. Our plan is eventually to put this ever-growing collection behind a subscription, so that customers can subscribe to gain access to the entire collection.

Other strategies we’re exploring include limited editions, and a commission service for those who want to have something that feels more exclusive and custom. We believe that artists should be paid for their work, and that people will pay for great art. Other than that, we’re open to experimenting, and we have a lot to learn from our community now that EO1 is out in the wild!

SC – Cooper Hewitt’s wallpapers have been up for a little while as you’ve been shipping out units to Kickstarter backers. What can you tell us about how people have been showing them? What sorts of stats are we looking at?

JL – Art from the Cooper Hewitt collection has been displayed 783 times in homes all over the world, with an aggregate on-display time of over 217 days! The three El Dorado scenic panels have been most popular!

Explore the Cooper Hewitt objects available for ambient viewing through Electric Objects, or visit Shop Cooper Hewitt in-store at 2 East 91st in New York to buy an EO1 unit from the museum tax-free [sorry, not currently available via our online store].

Guest post: Notes from hacking on the Cooper-Hewitt collections API

A couple of days ago the Labs hosted a guest to play with our API.

Over to Frankie to explain what he did and the challenges he faced. As it turns out, there’s a lot you can get done in a day.

Hi, I’m Frankie Roberto. I used to work at the Science Museum in London, where I produced their web projects. I’ve also worked with museums such as the British Museum whilst at digital agency Rattle. One theme running through all of this time is the importance of data, and the things that it can enable.

So when I learnt that the Cooper-Hewitt Museum had released a ‘public alpha’ of their collections database, the idea of spending a day playing with the data whilst in New York (on holiday!) seemed like it’d be fun. Plus, I get to hang out with Seb & co.

I signed up for an API account ahead of time. This does feel like a bit of a hurdle. Because the API uses OAuth 2.0, as well as creating an account you then have to create an application, and then authorise yourself against your own application in order to get an access token, which ultimately grants you access to the data. This makes more sense for situations where you want to get access to another user’s data (e.g. let’s say that users can bookmark favourite objects and you want to display a visualisation of them). For accessing public data it’s a little overkill. Thankfully the web interface makes it all fairly straightforward.

Ideally, I think it’d be simpler and more developer-friendly not to require API keys at all, and instead to simply allow anyone to retrieve the data with a simple GET request. These can even be tried out in a browser – a common convention is to simply add ‘.json’ on the end of URLs for JSON views. This also lets you use HTTP-level caching, which works at the browser end, the server end and proxies in the middle, keeping things speedy. On the downside, this would make it harder to monitor API usage.

Authentication quibbles aside, once set up I could begin querying the data.
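To give a flavour of what querying looks like once you’re set up, here’s a minimal sketch in Python. The endpoint style matches the API’s Flickr-ish design, but the method name, parameters and object id below are my assumptions for illustration – check the API docs for the real method list.

```python
import json
import urllib.parse
import urllib.request

API_ENDPOINT = "https://api.collection.cooperhewitt.org/rest/"
ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # the token you get at the end of the OAuth dance

def call_api(method, **params):
    """Call an API method with the access token and return parsed JSON."""
    params.update({"method": method, "access_token": ACCESS_TOKEN})
    url = API_ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)

# Fetch one object's details (method name and object id are illustrative):
print(call_api("cooperhewitt.objects.getInfo", object_id="18446531"))
```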

I came to the Cooper-Hewitt knowing very little about the institution other than that it is a design museum. My expectations then were that the collection would be a treasure trove of great design from the past century – things like the Henry vacuum cleaner or the Juicy Salif lemon squeezer by Philippe Starck. In short: ‘design classics’.

‘Classic’ is a funny word, often abused as a euphemism for old and obsolete, but when applied to design I think it implies quality, innovation, and timelessness – things you might still use today (hence the community around maintaining ‘classic cars’).

My challenge then was to see if, for a given type of thing, I could show the ‘classic’ versions of that thing from the Cooper-Hewitt collection.

To kick off, I looked at the list of ‘types’ in the collection. There are 2,998 of these, and they are for the most part simple & recognisable words or short phrases – things like ‘teapot’ and ‘chair’. The data is a little messy, also including more specific things like ‘side chair’ and ‘teapot and lid’, but, y’know, it’s good enough for now.

I could have retrieved the entire list of types through the API, but as you only get a small bunch at a time, this would have required ‘paging’ through the results with multiple requests. Not too tricky, but rather than coding the logic for this, it was a lot simpler to just import the full list from the CSV dump on GitHub.
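For the record, the paging logic isn’t hard – something like this sketch, reusing the call_api helper from the earlier snippet. The method name and the page/pages response fields are assumptions modelled on Flickr-style APIs:

```python
def fetch_all_types():
    """Page through the full list of object types, one request per page."""
    types, page, pages = [], 1, 1
    while page <= pages:
        response = call_api("cooperhewitt.types.getList", page=page, per_page=100)
        types.extend(response["types"])
        pages = int(response["pages"])  # total page count, as reported by the API
        page += 1
    return types
```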

The next step was to retrieve a list of objects for each type.

Unfortunately, this didn’t actually seem to be possible using the API (yet). So I went back to GitHub and used the CSV dump of all objects. This contains around 100,000 objects. Not a huge amount, but with a tip-off from Seb, I realised that I was actually only interested in the objects from the ‘product design’ department – a much smaller list of just 19,848 objects (the rest seem to be mainly drawings and textiles).

With these objects imported, the next step was to match the objects with the types.

This data didn’t seem to be in the CSV file – and it isn’t returned in the API response for object details either (an accidental omission, I think). Stuck, I turned to Seb’s team, and soon learned that what I thought was the object ‘name’ was actually a concatenation of the object’s type and age, separated by a comma. So, I could get an object’s type by simply reversing the process (slight gotcha: remember to ignore case).
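In code, reversing the concatenation is only a couple of lines. A sketch, assuming a set of known types loaded from the types CSV (lowercased to handle the case gotcha):

```python
def type_from_name(name, known_types):
    """Recover an object's type from a 'type, age' name field.

    known_types is a set of lowercased type names from the types CSV.
    Matching is case-insensitive; returns None if nothing matches."""
    # The age is appended after the type, so split on the last comma.
    candidate = name.rsplit(",", 1)[0].strip().lower()
    return candidate if candidate in known_types else None

known_types = {"teapot", "teapot and lid", "side chair"}
print(type_from_name("Teapot and lid, ca. 1745", known_types))  # -> 'teapot and lid'
```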

At this point I had a database of objects by type, but no images – which for most purposes are pretty crucial.

Ideally, links to the images would’ve been in the CSV dump. Instead, I’d have to query the API for each object and collect the links. Objects can have multiple images, but I only really need the main one, which is designated the ‘primary’ image in the API. Oddly, a good proportion of the objects had no primary image, but did have one or more non-primary images. In these cases, I’d just select the first image.

Script written, I started hitting the API. With 19,848 requests to make, I figured this’d take some time. About a quarter of the way through, I realised that the same data was also available in GitHub, and this could be queried by requesting the ‘raw’ version of the URLs (constructed by splitting the object id into bunches of three digits). So I modified my script to do just that, and set it going, this time starting from the bottom of my list of objects and working up. The GitHub-querying script ran a little faster than the Cooper-Hewitt API (probably not too surprising), and so both scripts ‘met’ somewhere in the middle of the list.
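The URL construction is worth spelling out. I’m reconstructing the path scheme from the ‘bunches of three digits’ rule, so treat the repository name and layout below as assumptions rather than gospel:

```python
def github_raw_url(object_id):
    """Build a raw GitHub URL for an object's JSON file, splitting the id
    into groups of three digits to form the directory path
    (e.g. 18446531 -> 184/465/31)."""
    digits = str(object_id)
    path = "/".join(digits[i:i + 3] for i in range(0, len(digits), 3))
    return ("https://raw.githubusercontent.com/cooperhewitt/collection"
            f"/master/objects/{path}/{digits}.json")

print(github_raw_url(18446531))
```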

The result of this was that I had images for roughly a quarter of the product design objects – around 5,000 in total. This seems like quite a lot, but given that lots of these are rather obscure things like ‘matchsafes’, the collection actually isn’t that big, and is rather patchy.

There’s a limit to how many products you can actually collect (and store), of course, and so I’m not suggesting that the museum go on an acquiring spree. But I do wonder whether, to present a good experience online, it might be wise to try and merge in some external product design databases to fill in the holes.

By the time I’d assembled all the data, I didn’t have too much time to consider how to present the ‘classic’ products from among the collection.

Ideally, I think this is something that the museum should expose its expertise in. It can be tempting for museums to pretend that all objects have equal value, but in reality there are always some objects that are considered better, more unique, or in this case ‘more classic’ than others. Museum curators are ideally placed to make these judgement calls (and to explain them). For mass-manufactured design objects, this is arguably more important than collecting them in the first place (it’s unlikely you’d be unable to find an original iPod for an exhibition if you needed one).

Ideas we came up with amongst the team were to try and look up the price of the object on eBay (price isn’t a perfect indicator of design value, but might be a reasonable proxy), or to try and see whether other museums, like the V&A, had also collected the same object.

In the end, I went with a simple crowd-sourcing model. Initially three random objects from each type are picked to be shown as the ‘classic’ ones (3 feels like a good number), with the others shown as smaller thumbnails below. You can then very simply vote objects up or down.
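The selection logic really is that simple – roughly this sketch, with a hypothetical ‘votes’ field on each object:

```python
import random

def classics_for_type(objects, n=3):
    """Split a type's objects into n featured 'classics' plus thumbnails.

    Objects are dicts with a hypothetical 'votes' count; until any votes
    exist, the featured three are picked at random."""
    if any(o["votes"] for o in objects):
        ranked = sorted(objects, key=lambda o: o["votes"], reverse=True)
    else:
        ranked = random.sample(objects, len(objects))  # random order initially
    return ranked[:n], ranked[n:]
```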

The result of this very simple demo is online at http://designclassics.herokuapp.com – feel free to explore (and vote on the objects).

Thanks to the Cooper-Hewitt for hosting me for the day. I look forward to seeing how the ‘alpha’ collections database develops into the ‘beta’, and then the full launch.

If you are an interaction design or digital humanities student, or just a nerd with a bent for playing with museum collections, and you feel like hanging out for a day or two in the Labs to make things, then we’d love to have you over.

Drop us a line and we’ll make it happen.

Mia Ridge explores the shape of Cooper-Hewitt collections

Or, “what can you learn about 270,000 records in a week?”

Guest post by Mia Ridge.

I’ve just finished a week’s residency at the Cooper-Hewitt, where Seb had asked me to look at ‘the shape of their collection‘. Before I started a PhD in Digital Humanities I’d spent a lot of time poking around collections databases for various museums, but I didn’t know much about the Cooper-Hewitt’s collections, so this was a nice juicy challenge.

What I hoped to do

Museum collections are often accidents of history, the result of the personalities, trends and politics that shaped an institution over time. I wanted to go looking for stories, to find things that piqued my curiosity and see where they led me. How did the collection grow over time? What would happen if I visualised materials by date, or object type by country? Would showing the most and least exhibited objects be interesting? What relationships could I find between the people listed in the Artist and Makers tables, or between the collections data and the library? Could I find a pattern in the changing sizes of different types of objects over time – which objects get bigger and which get smaller? Which periods have the most colourful or patterned objects?

I was planning to use records from the main collections database, which for large collections usually means some cleaning is required.  Most museum collections management systems date back several decades and there’s often a backlog of un-digitised records that need entering and older records that need enhancing to modern standards.  I thought I’d iterate through stages of cleaning the data, trying it in different visualisations, then going back to clean up more precisely as necessary.

I wanted to get the easy visualisations like timelines and maps out of the way early with tools like IBM’s ManyEyes and Google Fusion Tables so I could start to look for patterns in the who, what, where, when and why of the collections.  I hoped to find combinations of tools and data that would let a visitor go looking for potential stories in the patterns revealed, then dive into the detail to find out what lay behind it or pull back to view it in context of the whole collection.

What I encountered

Well, that was a great plan, but that’s not how it worked in reality. Overall I spent about a day of my time dealing with the sheer size of the dataset: it’s tricky to load a 60MB file of 270,000 rows into tools that are limited by the number of rows (Excel), rows/columns (Google Docs) or size of file (Google Refine, ManyEyes), and any search-and-replace cleaning takes a long time.

However, the unexpectedly messy data was the real issue – for whatever reason, the Cooper-Hewitt’s collections records were messier than I expected, and I spent most of my time trying to get the data into a workable state. There were also lots of missing fields, and lots of uncertainty and fuzziness, but again, that’s quite common in large collections – sometimes it’s the backlog in research and enhancing records, sometimes an object is unexpectedly complex (e.g. ‘Begun in Kiryu, Japan, finished in France’) and sometimes it’s just not possible to be certain about when or where an object was from (e.g. ‘Bali? Java? Mexico?’). On a technical note, some of the fields contained ‘hard returns’ which cause problems when exporting data into different formats. But the main issue was the variation and inconsistency in data entry standards over time. For example, sometimes fields contained additional comments – this certainly livened up the Dimensions fields, but also made it impossible for a computer to parse them.

In some ways, computers are dumb.  They don’t do common sense, and they get all ‘who moved my cheese’ if things aren’t as they expect them to be.  Let me show you what I mean – here are some of the different ways an object was listed as coming from the USA:

  • U.S.
  • U.S.A
  • U.S.A.
  • USA
  • United States of America
  • United States (case)

We know they all mean exactly the same place, but most computers are completely baffled by variations in punctuation and spacing, let alone acronyms versus full words.  The same inconsistencies were evident when uncertainties were expressed: it might have been interesting to look at the sets of objects that were made in ‘U.S.A. or England’ but there were so many variations like ‘U.S.A./England ?’ and ‘England & U.S.A.’ that it wasn’t feasible in the time I had.  This is what happens when tools encounter messy data when they expect something neat:

3 objects from ‘Denmark or Germany’? No! Messy data confuses geocoding software.
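Normalising the country variants listed above is mechanical once you’ve enumerated them – a minimal sketch, with anything unrecognised passed through for manual review:

```python
# Observed spellings mapped to one canonical value (from the list above).
US_VARIANTS = {"u.s.", "u.s.a", "u.s.a.", "usa",
               "united states of america", "united states"}

def normalise_country(value):
    """Collapse known variants of the USA to a single canonical name."""
    if value.strip().lower() in US_VARIANTS:
        return "United States"
    return value  # unknown or uncertain values are left for hand-cleaning

print(normalise_country("U.S.A."))  # -> 'United States'
```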

Data cleaning for fun and profit

I used Google Refine to clean up the records, then uploaded them to Google Fusion Tables or Google Docs for test visualisations. Using tools that let me move data between them was the nearest I could get to a workflow that made it easy to tidy records iteratively, without being able to tidy the records at source.

Refine is an amazing tool, and I would have struggled to get anywhere without it.  There are some great videos on how to use it at freeyourmetadata.org, but in short, it helps you ‘cluster‘ potentially similar values and update them so they’re all consistent.  The screenshot below shows Refine in action.

Google Refine in action
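Under the hood, Refine’s default clustering method builds a ‘fingerprint’ key for each value; values that share a key are probably the same thing. A rough Python approximation of the idea:

```python
import re

def fingerprint(value):
    """Approximate Refine's fingerprint key: lowercase, strip punctuation,
    then keep the unique tokens in sorted order."""
    value = re.sub(r"[^\w\s]", "", value.lower().strip())
    return " ".join(sorted(set(value.split())))

# Punctuation and word-order variants collapse to the same key:
print(fingerprint("U.S.A."))             # -> 'usa'
print(fingerprint("USA"))                # -> 'usa'
print(fingerprint("McClelland, Nancy"))  # -> 'mcclelland nancy'
print(fingerprint("Nancy McClelland"))   # -> 'mcclelland nancy'
```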

One issue is that museums tend to use question marks to record when a value is uncertain, but Refine strips out all punctuation, so you have to be careful about preserving the distinction between certain and uncertain records (if that’s what you want). The suitability of general tools for cultural heritage data is a wider issue – a generic timeline generator doesn’t know what year to map ‘early 17th century’ to for display, but date ranges are often present in museum data, and flattening them to 1600 or 1640 or even 1620 is a false level of precision that has the appearance of accuracy.
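A better-behaved tool would keep fuzzy dates as ranges rather than flattening them to a single year. A toy sketch of that idea (the patterns handled are illustrative, nowhere near a complete grammar):

```python
import re

def date_range(phrase):
    """Map a fuzzy date phrase to a (start_year, end_year) range.
    Only handles '[early|mid|late] Nth century' style phrases."""
    m = re.match(r"(early|mid|late)?\s*(\d+)(?:st|nd|rd|th)\s+century",
                 phrase.strip().lower())
    if not m:
        return None
    qualifier, century = m.group(1), int(m.group(2))
    start = (century - 1) * 100  # 17th century -> 1600
    spans = {"early": (0, 33), "mid": (33, 66), "late": (66, 99), None: (0, 99)}
    low, high = spans[qualifier]
    return (start + low, start + high)

print(date_range("early 17th century"))  # -> (1600, 1633), not a fake '1600'
```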

When were objects collected?

Having lost so much time to data cleaning without resolving all the issues, I eventually threw nuance, detail and accuracy out the window so I could concentrate on the overall shape of the collection. Working from the assumption that object accession numbers reflected the year of accession and probably the year of acquisition, I processed the data to extract just the year, then plotted it as accessions by department and total accessions by year. I don’t know the history of the Cooper Hewitt well enough to understand why certain years have huge peaks, but I can get a sense of the possible stories hidden behind the graph – changes of staff, the effect of World War II?  Why were 1938 and 1969 such important years for the Textiles Department, or 1991 for the Product Design and Decorative Arts Department?
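The year extraction itself is a one-liner if, as I assumed, accession numbers follow the usual year-first museum convention (e.g. ‘1938-57-12’ – the format is my assumption here):

```python
import re
from collections import Counter

def accession_year(accession_number):
    """Pull the leading four-digit year from an accession number,
    assuming a 'YYYY-lot-item' convention."""
    m = re.match(r"(\d{4})\b", accession_number)
    return int(m.group(1)) if m else None

numbers = ["1938-57-12", "1969-165-1", "1991-31-2", "uncatalogued"]
accessions_by_year = Counter(y for y in map(accession_year, numbers) if y)
print(accessions_by_year)  # counts per year, ready to plot
```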

Accessions by Year for all Departments

Or try the interactive version available at ManyEyes.

I also tried visualising the Textiles data as a bubble chart, to show in a different way the years when lots of objects were collected:

Accessions for Textiles Department by year

Where are objects from?

I also made a map which shows which countries have been collected from most intensively. To get this display, I had to remove any rows with values that didn’t exactly match the name of a single country, so it doesn’t represent the entire collection. But you can get a sense of the shape of the collection – for example, there’s a strong focus on objects from the US and Western Europe.

Object sources by country

The interactive version is available at http://bit.ly/Ls572u.

This also demonstrates the impact of the different tools – I’m sure the Cooper-Hewitt has more than 43 objects from the countries (England, Scotland, Wales and Northern Ireland) that make up the United Kingdom but Google’s map has only picked up references to ‘United Kingdom’, effectively masking the geo-political complexities of the region and hiding tens of thousands of records.

Linking Makers to the rest of the web

Using Refine’s Reconciliation tool, I automatically ‘reconciled’ or matched 9,000 names in the Makers table to records in Freebase. For example, the Cooper-Hewitt records about Gianni Versace were linked to the Freebase page about him, providing further context for objects related to him. By linking them to a URL that identifies the subject of a record, those records can now be part of the web, not just on the web. However, as might be expected with a table that contains a mixture of famous, notable and ordinary people, Refine couldn’t match everything with a high level of certainty, so 66,453 records are left as an exercise for the reader.

I also had a quick go at graphing the different roles that occurred in the Makers table.

The benefit of hindsight, and thoughts for the future

With hindsight, I would have stuck with a proper database for data manipulation because trying to clean really large datasets with consumer tools is cumbersome. I also would have been less precious about protecting the detail and nuance of the data and been more pragmatic and ruthless about splitting up files into manageable sizes and tidying up inconsistencies and uncertainties from the start.  I possibly should have given up on the big dataset and concentrated on seeing what could be done with the more complete, higher quality records.

The quality of collections data has a profound impact on the value of visualisations and mashups. The collections records would be more usable in future visualisations if they were tidied in the source database. A tool like Google Refine can help create a list of values to be applied, and provide some quick wins for cleaning date and place fields. Uncertainty in large datasets is often unavoidable, but with some tweaking Refine could also be used to provide suggestions for representing uncertainty more consistently. I’m biased, as crowdsourcing is the subject of my PhD, but asking people who use the collections to suggest corrections to records, or to help work through the records that can’t be cleaned automatically, could help deal with the backlog. Crowdsourcing could also be used to help match more names from the various People fields to pages on sites like Freebase and Wikipedia.

If this has whetted your appetite and you want to have a play with some of Cooper-Hewitt’s data, check out Collection Data Access & Download.

Finally, a big thank you to the staff of the Cooper-Hewitt for hosting me for a week.

People playing with collections #14: collection data on Many Eyes

Many Eyes Website

I love seeing examples of uses of our collection metadata in the wild. bartdavis has uploaded our data to Many Eyes and created a few visualizations.

I found it interesting to see how many “matchsafes” we have in the collection, as you can easily see in the “color blindness test” inspired bubble chart! Here are a few screen grabs, but check them out for yourself at http://www-958.ibm.com.

Of interest to us, too, is that these visualisations were only possible because we released the collection data as a single dump. If we had, like many museums, only provided an API, this would not have been possible (or would at least have been much more difficult).

Bubble chart of object types

Number of objects by century

Word cloud of object types

Building the wall

Last month we released our collection data on GitHub.com. It was a pretty monumental occasion for the museum and we all worked very hard to make it happen. In an attempt to build a small example of what one might do with all of this data, we decided to build a new visualization of our collection in the form of the “Collection Wall Alpha.”

The collection wall, Alpha

The idea behind the collection wall was simple enough: create a visual display of the objects in our collection that is fun and interactive. I thought about how we might accomplish this, what it would look like, and how much work it would be to get it done in a short amount of time. I thought about using our own .csv data; I tinkered, and played, and extracted, and extracted, and played some more. I quickly realized that the very data we were about to release required some thought to make it useful in practice. I probably over-thought.

Isotope

After a short time, we found a lovely jQuery plugin called Isotope. Designed by David DeSandro, Isotope offers “an exquisite jQuery plugin of magical layouts.” And it does! I quickly realized we should just use this plugin to display a never-ending waterfall of collection objects, each with a thumbnail, linked back to the records in our online collection database. Sounds easy enough, right?

Getting Isotope to work was pretty straight-forward. You simply create each item you want on the page, and add class identifiers to control how things are sorted and displayed. It has many options, and I picked the ones I thought would make the wall work.

Next I needed a way to reference the data, and I needed to produce the right subset of the data – the objects that actually have images! For this I decided to turn to Amazon’s SimpleDB. SimpleDB is pretty much exactly what it sounds like: a super-simple, scalable, non-relational database which requires no setup, configuration, or maintenance. I figured it would be the ideal place to store the data for this little project.

Once I had the data I was after, I used a tool called RazorSQL to upload the records to our SimpleDB domain. I then downloaded the AWS PHP SDK and used a few basic commands to query the data and populate the collection wall with images and data. Initially things were looking good, but I ran into a few problems. First, the data I was querying was over 16K rows tall. That’s a lot of data to store in memory. Fortunately, SimpleDB is already designed with this issue in mind. By default, a call to SimpleDB only returns the first 100 rows (you can override this, up to 2,500 rows). The last element in the returned data is a special token key which you can then use to call the next 100 rows.
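The token dance looks roughly like this (sketched in Python rather than the PHP SDK we actually used; simpledb_select() is a hypothetical stand-in for the SDK’s select call, and the domain and attribute names are placeholders):

```python
def fetch_page(page_size=100, next_token=None):
    """Fetch one page of rows from SimpleDB; returns the rows and the token
    for the next page (None once we've reached the end)."""
    result = simpledb_select(  # hypothetical helper wrapping the SDK's select
        f"select * from collection where has_image = 'true' limit {page_size}",
        next_token=next_token,
    )
    return result["items"], result.get("next_token")

# Walk the whole domain one page at a time, never holding all 16K rows:
rows, token = fetch_page()
while True:
    # ...render this page of objects...
    if not token:
        break
    rows, token = fetch_page(next_token=token)
```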

Using this in a loop, one could easily grab all 16K rows – but that sort of defeats the purpose, as it still fills up memory with the full 16K records. My next thought was to use paging, essentially grabbing 100 rows at a time, per page. Isotope offers a pretty nifty “Infinite Scroll” configuration, which I thought would be ideal, allowing viewers to scroll through all 16K images. Once I got the infinite scroll feature working, though, I realized that memory becomes an issue once you page down 30 or 40 pages. So, I’m going to have to figure out a way to dump out the buffer, or something along those lines, in a future release.

After about a month online, I noticed that SimpleDB charges were starting to add up. I haven’t really been able to figure out why. According to the docs, AWS only charges for “compute hours”, which by my reckoning should be much less than what I am seeing here. I’ll have to do some more digging on this one so we don’t break the bank!

SimpleDB charges

Another issue I noticed was that we were going to be calling lots of thumbnail images directly from our collection servers. This didn’t seem like such a great idea, so I decided to upload them all to an Amazon S3 bucket. To make sure I got the correct images, I created a simple PHP script that went through the 16K referenced images and automatically downloaded the correct resolution, auto-renaming each file to correspond with its record ID. Lastly, I set up an Amazon CloudFront CDN for the bucket, in hopes that this would speed up access to the images for users far and wide.
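The script itself was barely more than this sketch (Python standing in for the original PHP; the image URL pattern and output directory are placeholders, not our real ones):

```python
import os
import urllib.request

THUMB_URL = "https://images.example.org/objects/{id}_b.jpg"  # placeholder pattern
OUT_DIR = "thumbs"  # staged locally, then bulk-uploaded to the S3 bucket

def mirror_thumbnail(record_id):
    """Download one thumbnail and save it named after its record ID."""
    os.makedirs(OUT_DIR, exist_ok=True)
    dest = os.path.join(OUT_DIR, f"{record_id}.jpg")
    if not os.path.exists(dest):  # don't re-fetch files we already have
        urllib.request.urlretrieve(THUMB_URL.format(id=record_id), dest)
    return dest
```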

Overall I think this demonstrates just one possible outcome of releasing our collection metadata. I have plans to add more features, such as sorting and filtering, in the near future – but it’s a start!

Check out the code after the jump (a little rough, I know).
