Category Archives: Collection data

Random Button Television

AppleTV SDK Fun

Recently, Apple announced a number of updates to their product line, including a pretty major update to AppleTV, the small set top box that allows people to listen to their music collection, rent movies and TV shows, and stream audio from their phones to their televisions. The biggest update to AppleTV was definitely the fact that it now supports “Apps”, allowing iOS developers to design and build whatever they can imagine.

I threw my hat in the ring and applied for Apple’s lottery, and about a week later a brand new AppleTV Software Development Kit showed up at the Labs.

To be honest, I haven’t had much interest in developing apps for a long time now. It’s problematic at best to go down the road of building something for iOS ( or any brand specific device ) in the context of a museum, and yes, it’s been a long while since I even glanced at Objective-C. But the device is a curious object, and at the very least made me wonder what it might be like to introduce a way to open up access to our collections through the warm and inviting glow of a television screen. Imagine it for a moment, sitting there atop your dresser, or mounted to your living room wall, next to the fire place, in full HD. The television, no matter how you divide it up over the years, has a pretty permanently fixed position in our homes, and in our minds.

As a little side project, I decided to see what I could do with the AppleTV SDK and our Collections API. I decided ahead of time I wouldn’t spend too much time on this, and although I wound up spending at least one night reading up on NSDictionary and a few other oddball data-types in Objective-C, I was able to stick with my original plan and quickly built a little “Hello AppleTV” app that simply allows the “viewer” to flip through objects in our collection by pressing the “select” button on their remotes.

It uses one API method, our old favorite cooperhewitt.objects.getRandom, and yeah, that’s all it does. Keep pressing the select button and you continue to get objects. It’s quite fun!

AppleTV XCode

So here’s how it works. As we say over here at the Cooper Hewitt Labs, “Working code always wins.”

  1. There is a ViewController. This is the thing that represents the screen on your AppleTV, and the thing you can apply all of the subsequent properties to.
  2. There is an ImageView. This is where the image gets displayed.
  3. There are a couple of Labels which simply allow you to display the title and object ID for each object.
  4. There is an asynchronous way of calling the API.

Here’s the code. There’s basically just the two files for the ViewController that define everything we’re talking about. Beyond that, there is some “wiring up” of the ImageView and Labels so the visuals know what code they are connected to, and there is really the one method “fetchRandom” that does the work of calling the API, parsing the response, and storing the things we are interested in.
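If you’re curious what that one method boils down to, here is a quick Python sketch of the same request. The endpoint, the has_images parameter and the exact response shape are assumptions based on the public API documentation rather than the app’s actual Objective-C, so treat it as illustrative only.

    # A rough Python equivalent of what fetchRandom does: call
    # cooperhewitt.objects.getRandom and pull out the bits we display.
    import requests

    API_ENDPOINT = "https://api.collection.cooperhewitt.org/rest/"
    ACCESS_TOKEN = "YOUR-ACCESS-TOKEN"  # placeholder

    def fetch_random():
        params = {
            "method": "cooperhewitt.objects.getRandom",
            "access_token": ACCESS_TOKEN,
            "has_images": 1,  # assumption: skip objects without images
        }
        rsp = requests.get(API_ENDPOINT, params=params).json()
        obj = rsp.get("object", {})

        # The app only cares about three things: the title, the object ID
        # and an image URL to load into the ImageView.
        images = obj.get("images") or [{}]
        return {
            "id": obj.get("id"),
            "title": obj.get("title"),
            "image": images[0].get("b", {}).get("url"),
        }

    if __name__ == "__main__":
        print(fetch_random())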

And here is the end result.

To be quite honest, we probably won’t be uploading this to the iTunes store. It’s really just a “Hello World” app and only meant to be a conversation starter for staff members who happen by the Labs area. But it does make me wonder — what else could museums do with a device like this?

The device itself is a curious one, with plenty of built in human interface challenges and opportunities. Sitting back, clicking the remote and checking out collection objects, isn’t really my idea of an exciting way to spend an evening at home, but take this a few steps further, maybe a few additional API calls, and who knows what might unfold.

Long live RSS


I just made a new Tumblr. It’s called “Recently Digitized Design.” It took me all of five minutes. I hope this blog post will take me all of ten.

But it’s actually kinda cool, and here’s why. Cooper Hewitt is in the midst of a mass digitization project where we will have digitized our entire collection of over 215K objects by mid to late next year. Wow! 215K objects. That’s impressive, especially when you consider that probably 5000 of those are buttons!

What’s more, we now have a pretty decent “pipeline” up and running. This means that as objects are digitized and added to our collections management system, they automatically wind up on our collections website after making their way through a pretty hefty series of processing tasks.

Over on the West Coast, Aaron felt the need to make a little RSS feed for these “recently digitized” objects so we could all easily watch the new things come in. RSS, which stands for “Rich Site Summary”, has been around forever, and many have said that it is now a dead technology.

Lately I’ve been really interested in the idea of Microservices. I guess I never really thought of it this way, but an RSS or ATOM feed is kind of a microservice. Here’s a highlight from Sam Newman’s Building Microservices that explains this idea in more detail.

Another approach is to try to use HTTP as a way of propagating events. ATOM is a REST-compliant specification that defines semantics ( among other things ) for publishing feeds of resources. Many client libraries exist that allow us to create and consume these feeds. So our customer service could just publish an event to such a feed when our customer service changes. Our consumers just poll the feed, looking for changes.

Taking this a bit further, I’ve been reading this blog post, which explains how one might turn around and publish RSS feeds through an existing API. It’s an interesting concept, and I can see us making use of it for something just like Recently Digitized Design. It sort of brings us back to the question of how we publish our content on the web in general.

In the case of Recently Digitized Design the RSS feed is our little microservice that any client can poll. We then use IFTTT as the client, and Tumblr as the output where we are publishing the new data every day. 
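For what it’s worth, the “RSS feed as microservice” idea is small enough to sketch in a few lines. Here is a hypothetical Flask version of such a feed; the fetch_recently_digitized() helper is made up and just stands in for wherever the real feed gets its data.

    # A minimal, hypothetical RSS "microservice": one route that renders
    # recently digitized objects as an RSS 2.0 feed for a client like
    # IFTTT to poll.
    from flask import Flask, Response
    from xml.sax.saxutils import escape

    app = Flask(__name__)

    def fetch_recently_digitized():
        # Stand-in for a call to the collections API.
        return [
            {"title": "Example object", "url": "https://example.org/objects/123/"},
        ]

    @app.route("/recently-digitized.rss")
    def recently_digitized():
        items = "".join(
            "<item><title>{}</title><link>{}</link></item>".format(
                escape(obj["title"]), escape(obj["url"])
            )
            for obj in fetch_recently_digitized()
        )
        feed = (
            '<?xml version="1.0" encoding="UTF-8"?>'
            '<rss version="2.0"><channel>'
            "<title>Recently Digitized Design</title>"
            "<link>https://example.org/</link>"
            "<description>Newly photographed objects</description>"
            + items +
            "</channel></rss>"
        )
        return Response(feed, mimetype="application/rss+xml")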

RSS certainly lives up to its nickname ( Really Simple Syndication ), offering a really simple way to serve up new data, and that to me makes it a useful thing for making quick and dirty prototypes like this one. It’s not a streaming API or a fancy push notification service, but it gets the job done, and if you log in to your Tumblr Dashboard, please feel free to follow it. You’ll be presented with 10-20 newly photographed objects from our collection each day.



Print The Exhibition – The Label Book Generator

As a Peter A. Krueger intern this summer, I am working in both the Digital and Emerging Media and Cross-Platform Publishing Departments at the Cooper Hewitt. Since I am traversing the two departments, a project that allows me to learn from each and create something that benefits both is of course ideal. The Label Book Generator does this in a twofold manner: It allows me the opportunity to learn and write code to develop a digital product, which in turn, serves to produce a physical publication of interpretive content for an exhibition.


Label Book Generator–’How Posters Work’ exhibition page

Currently a prototype, The Label Book Generator is a tool that creates a printed publication of object labels for each exhibition at the Cooper Hewitt. In its most basic use, Visitor Services at the museum can navigate to an exhibition from a list on the website’s homepage and once on an exhibition page, press Command-P (or File > Print) to generate a PDF with an initial cover page followed by a single label on each page–all entirely set in a larger font-size.

What initially prompted the development of this prototype was to solve readability issues visitors may have with existing wall labels. This does not imply that the current label design needs to change or be set in a larger font-size, but instead that the labeling system as a whole should be augmented with something to make them more accessible, to provide a magnifying glass of sorts when needed.


Publication in use in the gallery

The entire process proved to be invaluable as a learning experience. From the start it was obvious that I needed to leverage the museum’s API to access object data by exhibition to ultimately populate fields in each label. As the Label Book Generator website currently stands, the selection and order of the fields follow a predefined template that begins to apply the typographic guidelines of the existing wall labels. As a graphic designer it was particularly interesting for me to consider the meticulous planning that is usually involved in typesetting in parallel with the time spent writing the code, whereas typically these two processes are dealt with in succession.

Since the end result needed to be a book, I was set on formatting the data in a markdown document that would have typographic styles manually applied in InDesign. A Python script was written to create a markdown document with syntax assigned to each field, e.g., titles would be prepended with ‘#’ to be a top level header, dates with ‘##’ to be a second level, etc.
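To give a sense of what that script was doing, here is a simplified sketch of the idea (the field names are illustrative, not the actual script’s):

    # Each label field gets a markdown prefix so typographic styles could
    # later be mapped to heading levels in InDesign.
    def label_to_markdown(label):
        parts = [
            "# " + label.get("title", ""),    # titles as top-level headers
            "## " + label.get("date", ""),    # dates as second-level headers
            label.get("description", ""),     # body text left unmarked
        ]
        return "\n\n".join(p for p in parts if p.strip("# \n"))

    labels = [
        {"title": "Poster, Example", "date": "1968", "description": "Offset lithograph."},
    ]

    with open("labels.md", "w") as fh:
        fh.write("\n\n".join(label_to_markdown(l) for l in labels))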

Stumbling along with my rudimentary skills in Python–and at one point rewriting the whole thing in Javascript, only to go back to Python–led me to conclude that outputting the final document with InDesign could be circumvented. With the much-appreciated help of Micah Walter, it was settled that rather than generating a markdown file, I should instead produce a small web application using Python and Flask as a framework. The most salient aspect of the entire project is now a simple print style sheet for the website that automatically generates the same final document that manually typesetting in InDesign would have produced (the code is available on Github).
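The shape of that Flask approach looks roughly like this; the route, template and helper names below are invented for illustration and are not the actual Generator code.

    # One route per exhibition: fetch its objects, render every label into
    # a single page, and let the print style sheet handle the pagination.
    from flask import Flask, render_template

    app = Flask(__name__)

    def get_exhibition_objects(exhibition_id):
        # Hypothetical helper: in reality this would call the Cooper Hewitt
        # API and return the fields each label needs.
        return [{"title": "Example object", "date": "1968", "medium": "lithograph"}]

    @app.route("/exhibitions/<exhibition_id>")
    def exhibition_labels(exhibition_id):
        objects = get_exhibition_objects(exhibition_id)
        # labels.html sets one label per printed page via an @media print rule.
        return render_template("labels.html", objects=objects)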

With a central concern for typography, the print style sheet seamlessly flows all the content into any fixed page format, which in this case is a printable PDF. The printed document, once bound, can be considered an exhibition catalog reduced to its essential elements: a list of every work, with its respective information and descriptions (when available).


Interior spread of printed publication

The Label Book Generator solves the initial prompt of assisting those hard of seeing. However, considering that the website from the get-go is built with a responsive layout and scalable typography (again due to the simultaneity of graphic design and web development), there are a number of opportunities to expand its role and purpose.

The typography, padding and margins are set in REMs (root em) rather than fixed sizes, which allows us to control the base size and adjust the measurements relative to it. A future version of the website could include in the interface a means to control how large or small the base size of the document should be, given the dimensions of the fixed format–whether it be a standard letter-sized PDF, or otherwise.


Browser print dialog box


Cover page of printed publication generated from the website


Interior spread of printed publication

When presenting the prototype to others here at the museum, Katie Shelly brought up an interesting future use case involving blind visitors and screen reader software. In addition to the possibilities with printable versions of the Label Book Generator, the website itself provides a responsive mobile view of all the labels, which could theoretically be read to visitors via their personal devices.


Mobile view of website

Finally, the printed label book serves as a means to visualize the collection database. If a label in the book and website is missing a field, it reflects an oversight at the ‘source of truth’. In other words, there is a one-to-one relationship between the fields in the labels and those in the database. Ultimately, this brings to mind the commonplace workflow of producing wall labels that are manually written, designed, and edited (on this topic see also: Label Whisperer). In a later version, a similar process of using the museum’s API to automate the generation of the label book could theoretically be applied to the entire production of wall labels for the museum.

Missing tags for the object on recto

Missing tags for ‘Amerika’

Give the Generator a go!

Object concordances – what is the simplest thing to match like with like?


Do you notice anything special about this screenshot of Charles Eames’ famous No. 670 Chair?

It might be hard to see because it’s a tall screenshot and this is a small thumbnail. Have a look at the large version. Hint: It’s not the part where the chair is missing in the picture. It’s actually this, on the right-hand side of the object details:


Object concordances! With other museums! To the same objects in their collections!! On their own websites!!!

Before you get too excited (and think it’s actual working ‘Linked Data’), we should point out that as of this writing we have only “concordified” four distinct objects – this one, this one, this one and that one – eight times with four separate organizations, one of which is our own shop, so there is a lot of work left to do.


If you look carefully you can see that most of the concordances, to date, were added within about 90 minutes of one another. That’s because Seb and I were talking about object concordances over lunch that day and agreed that we could probably push the simplest and dumbest thing out the door before I went home. It’s something that has been on the agenda since mid-2012.

Specifically, we maintain a fixed list of institutions with whom we will “concordify” objects. If your institution isn’t on that list yet it’s not personal. We can add as many institutions as we want but we think the narrow focus helps to explain the purpose of the tool. Then we simply record that institution’s unique ID, the object ID for something in our collection and the object ID for something in their collection. That’s it.
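That really is the whole data model. In pseudo-database terms it is something like this (the names are invented for illustration):

    # One concordance row: three identifiers, nothing more.
    concordance = {
        "institution_id": 5,            # our internal ID for the other museum
        "object_id": 18704235,          # the Cooper Hewitt object ID
        "foreign_object_id": "670-1",   # their identifier for the same object
    }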


Currently the tools for adding concordances, or editing institutions, are … terrible.

(Or rather, they are the unadorned plumbing that makes the whole thing work. So they are beautiful and elegant in their own way but most people would be forgiven for not seeing those qualities right away.)

Short-term the goal is to build a friendlier “admin” web page for a few more people to add concordances without having to worry about the technical details. Medium-term the goal is to create restricted API methods for doing fancy-pants buttons and pop-up dialogs on the object pages themselves to allow staff to add concordances as they think of them or are otherwise just poking around the collections website. Maybe in the long term, ‘the crowd’ might be invited to do it too.


Somewhere between those two things we will also build proper “index” pages on the collections website of all the objects that have been concordified, all the institutions that have concordified objects and so on. Just like we’ve already done for people.

The other thing we’ll do shortly is make sure that these concordances are included in the CC0 Cooper Hewitt collections metadata dump which is available on GitHub.

When we said “the simplest thing” we meant it.

There isn’t much yet but it’s a start – a tangible proof of what it could be – and if we’ve done our job right then it is one of those things that will grow exponentially, as always, as time and circumstance permit.

(If you’ve been a long-time reader you might remember we did Rijkscolors back in 2013 as an experiment in automatically matching objects – but we were undone by language and structural differences in metadata, and the reality that humans might still be better at this, at least until the sector irons out a few things.)

Collect all the things – shoeboxes, shop items and the Pen


You can now collect any object in the collection, or on display, from the collections website itself. Just like in the galleries, there is a small “collect” icon on the top right-hand side of every object page on the collections website. It’s not just individual object pages but all the object list pages, too. So many “collect” icons!


  Objects that haven’t been collected yet have a grey icon.

  Objects that have been collected in the galleries, as part of a visit to the museum, have a pink icon.

  Objects that have been collected on the collections website have an orange icon.

Simply click the grey icon to collect an object or click one of the orange or pink icons to remove or un-collect that object.

That’s it!


Just like visit items, things you collect on the website have a permanent URL that can be made public to share with other people and can be given a bespoke title or description. Objects that you collect on the collections website live in something we’re calling the “shoebox”.

You can get to your shoebox by visiting or, if you’re already logged in to your Cooper Hewitt account, by visiting

There is also a handy link in the Your stuff menu, located at the top-left of every page on the collections website.

The shoebox is the set of all the objects you’ve collected (or created) on the website or during your visits to the museum. Although visits and visit items overlap with things in your shoebox we still treat them differently: you need to be logged in to your Cooper Hewitt account to add things to your shoebox, but a visit to the museum can be entirely anonymous if a visitor so chooses.

The default view for the shoebox is to display everything together in reverse-chronological order but you can filter the view to show only things collected online or things collected during a visit. You can also see the set of all the objects you’ve made public or private.


logged out view (large version)


logged in view (large version)

But it’s not just objects, either. You can already collect videos during your museum visit so those are included too. Ultimately the only limit to what you might collect with the Pen is time-and-typing. Things we’re thinking about making collect-able include: entire exhibitions or the introductory texts on the wall for an exhibition or people or individual rooms in the Mansion.

Museum retail

We’ve started this process by allowing you to collect things in the museum Shop.

By “things in the Shop” we mean all the things that have ever been sold in the Shop over the years. And by “all the things” we mean almost all the things. There is some technical hoop-jumping related to inventory management systems and that is why we don’t have everything yet but we’ll get there in time.

We are a capital-D design museum with a capital-D design shop, and many of the things that have been available in the Shop have gone on to become part of our permanent collection, so it only makes sense to give them a home on the collections website. In fact MoMA already does something similar with their “find related products in the MoMA Store” feature, though ours is a bit different.


You can see for yourself at

The /shop section is divided in two parts: Brands and Items (and all the items for a given brand, of course). There isn’t a whole lot of extra information beyond titles and links to the SHOP Cooper Hewitt website for those items that are currently in stock, but it’s a start. Like the rest of the collections website we’ve started with the idea that providing permanent, stable URLs that people can have confidence in creates something that can be improved on over time.


Shop items and brands don’t get updated as regularly as we’d like yet. We are still working through the fiddly details of bridging our systems with the Shop’s ecommerce and POS system, and some things still need to be done by hand. We’ve been able to get this far, though, so we expect things will only get better.


You might be wondering…

You might be reading this and starting to wonder: Hmmm… does that mean I can also collect things in the Shop as I walk around the museum with the Pen? The answer is… Yes!

As of this writing there are only one or two items that can be collected with the Pen, because the Shop staff are still getting familiar with the tools and thinking about how making collect-able labels changes their day-to-day workflow. The obvious future of this might be the infamous ‘wedding register’; however, we believe that many museum visitors actually would like to bookmark objects to possibly buy later, or just remember as part of their overall visit to the ‘museum campus’.

Practically, what that has meant is some changes to Sam’s “tag writer” application (the subject of a future blog post) to fetch shop items via our API, and then letting the Shop folks decide what they want to tag and when they want to do it.

There has been a whole lot of change here over the course of the last three years, and it is really important that we allow the various parts of the museum to warm up to the possibilities that the Pen affords at their own pace, with not only a minimum of fuss but plenty of wiggle-room for experimentation.

In the meantime we hope that you enjoy collecting at least more, if not all, of the things that make up the museum.

Sorting, Synonyms and a Pretty Pony

We’ve been undergoing a massive rapid-capture digitization project here at the Cooper Hewitt, which means every day brings us pictures of things that probably haven’t been seen for a very, very long time.

As an initial way to view all these new images of objects, I added “date last photographed” to our search index and allowed results on the search results page to be sorted by it.

That’s when I found this.

Figure Of A Pony (Germany), ca. 1930. glazed earthenware. Gift of Victor Wiener. 2000-47-20.

I hope we can all agree that this pony is adorable and that if there is anything else like it in our collection, it needs to be seen right now. I started browsing around the other recently photographed objects and began to notice more animal figurines:

Rooster Figure, 20th century. porcelain. Gift of J. Lionberger Davis. 1968-1-26.

Figure (China). porcelain. Gift of Mr. and Mrs. Ernest du Pont. 1980-55-2.

As serendipitous as it was that I came across this wonderful collection-within-a-collection by browsing through recently-photographed objects, what if someone is specifically looking for this group? The whole process shows off some of the work we did last summer switching our search backend over to Elasticsearch (which I recently presented at Museums and the Web). We wanted to make it easier to add new things so we could provide users (and ourselves) with as many “ways in” to the collection as possible, as it’s those entry points that allow for more emergent groupings to be uncovered. This is great for somebody who is casually spending time scrolling through pictures, but a user who wants to browse is different from a user who wants to search. Once we uncover a connected group of objects, what can we do to make it easier to find in the future?

Enter synonyms. Synonyms, as you might have guessed, are a text analysis technique we can use in our search engine to relate words together. In our case, I wanted to relate a bunch of animal names to the word “animal,” so that anyone searching for terms like “animals” or “animal figurines” would see all these great little friends. Like this bear.

Figure, 1989. porcelain, enameled and gilded decoration. 1990-111-1.

The actual rule (generated with the help of Wikipedia’s list of animal names) is this:

 "animal => aardvark, albatross, alligator, alpaca, ant, anteater, antelope, ape, armadillo, baboon, badger, barracuda, bat, bear, beaver, bee, bird, bison, boar, butterfly, camel, capybara, caribou, cassowary, cat, kitten, caterpillar, calf, bull, cheetah, chicken, rooster, chimpanzee, chinchilla, chough, clam, cobra, cockroach, cod, cormorant, coyote, puppy, crab, crocodile, crow, curlew, deer, dinosaur, dog, puppy, salmon, dolphin, donkey, dotterel, dove, dragonfly, duck, poultry, dugong, dunlin, eagle, echidna, eel, elephant, seal, elk, emu, falcon, ferret, finch, fish, flamingo, fly, fox, frog, gaur, gazelle, gerbil, panda, giraffe, gnat, goat, sheep, goose, poultry, goldfish, gorilla, blackback, goshawk, grasshopper, grouse, guanaco, fowl, poultry, guinea, pig, gull, hamster, hare, hawk, goshawk, sparrowhawk, hedgehog, heron, herring, hippopotamus, hornet, swarm, horse, foal, filly, mare, pig, human, hummingbird, hyena, ibex, ibis, jackal, jaguar, jellyfish, planula, polyp, scyphozoa, kangaroo, kingfisher, koala, dragon, kookabura, kouprey, kudu, lapwing, lark, lemur, leopard, lion, llama, lobster, locust, loris, louse, lyrebird, magpie, mallard, manatee, mandrill, mantis, marten, meerkat, mink, mongoose, monkey, moose, venison, mouse, mosquito, mule, narwhal, newt, nightingale, octopus, okapi, opossum, oryx, ostrich, otter, owl, oyster, parrot, panda, partridge, peafowl, poultry, pelican, penguin, pheasant, pigeon, bear, pony, porcupine, porpoise, quail, quelea, quetzal, rabbit, raccoon, rat, raven, deer, panda, reindeer, rhinoceros, salamander, salmon, sandpiper, sardine, scorpion, lion, sea urchin, seahorse, shark, sheep, hoggett, shrew, skunk, snail, escargot, snake, sparrow, spider, spoonbill, squid, calamari, squirrel, starling, stingray, stinkbug, stork, swallow, swan, tapir, tarsier, termite, tiger, toad, trout, poultry, turtle, vulture, wallaby, walrus, wasp, buffalo, carabeef, weasel, whale, wildcat, wolf, wolverine, wombat, woodcock, woodpecker, worm, wren, yak, zebra"

Where every word to the right of the => automatically gets added to a search for a word to the left.
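In Elasticsearch terms, a rule like that gets wired in as a synonym token filter inside an analyzer. Here is a hedged sketch of what that might look like; the index name, analyzer name and the trimmed-down synonym list are illustrative, not our actual mapping.

    # Create an index whose analyzer expands "animal" into a list of
    # animal names at analysis time via a synonym token filter.
    from elasticsearch import Elasticsearch

    settings = {
        "analysis": {
            "filter": {
                "animal_synonyms": {
                    "type": "synonym",
                    "synonyms": [
                        # the real rule lists every animal name from Wikipedia
                        "animal => aardvark, albatross, alligator, bear, pony, rooster",
                    ],
                }
            },
            "analyzer": {
                "objects_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "animal_synonyms"],
                }
            },
        }
    }

    es = Elasticsearch()
    es.indices.create(index="objects", body={"settings": settings})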

Not only does our new search stack provide us with a useful way to discover emergent relationships, but it makes it easy for us to “seal them in,” allowing multiple types of user to get the most from our collections site.

How re-opening the museum enhanced our online collection: new views, new API methods

At the backend of our museum’s new interactive experiences lies our API, which is responsible for providing the frontend with all the data necessary to flesh out the experience. From everyday information like an object’s title to more novel features such as tags, videos and people relationships, the API gathers and organizes everything that you see on our digital tables before it gets displayed.

In order to meet the needs of the experiences designed for us by Local Projects on our interactive tables, we added a lot of new data to the API. Some of it was sitting there and we just had to go find it; other aspects we had to generate anew.

Either way, this marks a huge step towards a more complete and meaningful representation of our collection on the internet.

Today, we’re happy to announce that all of this newly-gathered data is live on our website and is also publicly available over the API (head to the API methods documentation to see more about that if you’re interested in playing with it programmatically).


For the Hewitt Sisters Collect exhibition, Local Projects designed a front-end experience for the multitouch tables that highlights the early donors to the museum’s collection and how they were connected to each other. Our in-house “TMS liaison”, Sara Rubinow, worked to gather and structure this information before adding it to TMS, our collection management system, as “constituent associations”. From there I extracted the structured data to add to our website.

We created the following new views on the web frontend to house this data:

We also added a few new biography-related fields: portraits or photographs of Hewitt Sisters people and two new biographies, one 75 words and the other 50 characters. These changes are viewable on applicable people pages (e.g. Eleanor Garnier Hewitt) and the search results page.

The overall effect of this is to make more use of this ‘people-related’ data, and to encourage the further expansion of it over time. We can already imagine a future where other interfaces examining and revealing the network of relationships behind the people in our collection are easily explored.

Object Locations and Things On Display

Some of the more difficult tasks in updating our backend to meet the new requirements related to dealing with objects no longer being static – but moving on and off display. As far as the website was concerned, it was a luxury in our three years of renovation that objects weren’t moving around a whole lot because it meant we didn’t have to prioritize the writing of code to handle their movement.

But now that we are open we need to better distinguish those objects in storage from those that are on display. More importantly, if an object is on display, we also need to say which exhibition it is part of, and in which room it is on display.

Object locations have a lot of moving parts in TMS, and I won’t get into the specifics here. In brief, object movements from location to location are stored chronologically in a database. The “movement” is its own row that references where the object moved and why it moved there. By appropriately querying this history we can say what objects have ever been in the galleries (like all museums, we have a large portion of objects that have never been part of an exhibition) and what objects are there right now.
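To make that concrete, here is the kind of query we end up running against that history. The table and column names below are invented for illustration; TMS’s actual schema is considerably more involved.

    # Find the objects whose most recent movement landed them in a gallery.
    import sqlite3

    def objects_currently_on_display(db):
        sql = """
            SELECT m.object_id, m.location_id
            FROM object_movements m
            JOIN (
                SELECT object_id, MAX(moved_on) AS moved_on
                FROM object_movements
                GROUP BY object_id
            ) latest
              ON latest.object_id = m.object_id
             AND latest.moved_on = m.moved_on
            JOIN locations l ON l.id = m.location_id
            WHERE l.is_gallery = 1
        """
        return db.execute(sql).fetchall()

    # db = sqlite3.connect("movements.db")
    # print(objects_currently_on_display(db))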

We created the following views to house this information:


The additions we’ve made to exhibitions are:

There is still some work to be done with exhibitions. This includes figuring out a way to handle object rotations (the process of swapping out some objects mid-exhibition) and outgoing loans (the process of lending objects to other institutions for their exhibitions). We’re expecting that objects on loan should say where they are and which external exhibition they are part of — creating a valuable public ‘trail’ of where an object has traveled over its life.


Over the summer, we began an ongoing effort to ‘tag’ all the objects that would appear on the multitouch tables. This includes everything on display, plus about 3,000 objects related to those. The express purpose for tags was to provide a simple, curated browsing experience on the interactive tables – loosely based around themes ‘user’ and ‘motif’. Importantly these are not unstructured, and Sara Rubinow did a great job normalizing them where possible, but there haven’t been enough exhibitions, yet, to release a public thesaurus of tags.

We also added tags to the physical object labels to help visitors draw their own connections between our objects as they scan the exhibitions.

On the website, we’ve added tags in a few places:

That’s it for now – happy exploring! I’ll follow up with more new features once we’re able to make the associated data public.

Until then, our complete list of API methods is available here.

A colophon for bias

The term [colophon] derives from tablet inscriptions appended by a scribe to the end of a … text such as a chapter, book, manuscript, or record. In the ancient Near East, scribes typically recorded information on clay tablets. The colophon usually contained facts relative to the text such as associated person(s) (e.g., the scribe, owner, or commissioner of the tablet), literary contents (e.g., a title, “catch” phrase, number of lines), and occasion or purpose of writing.


A couple of months ago we added the ability to search the collections website by color using more than one palette. A brief refresher: Our search by color functionality works by first extracting the dominant palette for an image. That means the top 5 colors out of a possible 32 million choices. 32 million is too large a surface area to search against so each of the five results is then “snapped” to its closest match on a much smaller grid of possible colors. These matches are then indexed and used to query our database when someone searches for objects matching a specific color.

It turns out that the CSS3 color palette, which defines a fixed set of 138 colors, is an excellent choice for doing this sort of thing. CSS is the acronym for Cascading Style Sheets, a “language used to describe the presentation” of a webpage separate from its content. Instead of asking people searching the collections website to be hyper-specific in their queries we take the color they are searching for and look for the nearest match in the CSS palette.

For example: #ef0403 becomes #ff0000 or “red”. #f2e463 becomes #f0e68c or “khaki” and so on.
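The “snapping” itself is just a nearest-neighbour lookup against the palette. Here is a simplified sketch of the idea; the two-entry palette and the plain RGB distance are simplifications for illustration, not necessarily the exact distance measure we use.

    # Snap an arbitrary hex colour to the closest named colour in a palette.
    def hex_to_rgb(h):
        h = h.lstrip("#")
        return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

    CSS3_PALETTE = {
        "red":   "#ff0000",
        "khaki": "#f0e68c",
        # ... the full CSS3 palette defines 138 named colours
    }

    def snap_to_palette(hex_colour, palette=CSS3_PALETTE):
        r, g, b = hex_to_rgb(hex_colour)

        def distance(name):
            pr, pg, pb = hex_to_rgb(palette[name])
            return (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2

        return min(palette, key=distance)

    print(snap_to_palette("#ef0403"))  # -> "red"
    print(snap_to_palette("#f2e463"))  # -> "khaki"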

This approach allows us to not only return matches for a specific color but also to show objects that are more like a color than not. It’s a nice way to demonstrate the breadth of the collection and also an invitation to pair objects that might never be seen together.


From the beginning we’ve always planned to support multiple color palettes. Since the initial search-by-color functionality was built in a hurry, with a focus on seeing whether we could get it to work at all, adding support for multiple palettes was always going to require some re-jiggering of the original code. Which of course meant that finding the time to make those changes had to compete with the crush of everything else, and on most days it got left behind.

Earlier this year Rebecca Alison Meyer, the 6-year-old daughter of Eric Meyer, a long-standing member of the CSS community, died of cancer. Eric’s contributions and work to promote the CSS standard cannot be overstated. The web would be an entirely other (an entirely poorer) space without his efforts and so some people suggested that a 139th color be added to the CSS Color module to recognize his work and honor his daughter. In June Dominique Hazaël-Massieux wrote:

I’m not sure about how one goes adding names to CSS colors, and what the specific purpose they fulfill, but I think it would be a good recognition of @meyerweb ‘s impact on CSS, and a way to recognize that standardization is first and foremost a social process, to name #663399 color “Becca Purple”.

In reply Eric Meyer wrote:

I have been made aware of the proposal to add the named color beccapurple (equivalent to #663399) to the CSS specification, and also of the debate that surrounds it.

I understand the arguments both for and against the proposal, but obviously I am too close to both the subject and the situation to be able to judge for myself. Accordingly, I let the editors of the Colors specification know that I will accept whatever the Working Group decides on this issue, pro or con. The WG is debating the matter now.

I did set one condition: that if the proposal is accepted, the official name be rebeccapurple. A couple of weeks before she died, Rebecca informed us that she was about to be a big girl of six years old, and Becca was a baby name. Once she turned six, she wanted everyone (not just me) to call her Rebecca, not Becca.

She made it to six. For almost twelve hours, she was six. So Rebecca it is and must be.

Shortly after that #663399 or rebeccapurple was added to the CSS4 Colors module specification. At which point it only seemed right to finally add support for multiple color palettes to the collections website.


Over the course of a month or so, in the margins of the day, all of the search-by-color code was rewritten to work with more than a single palette and now you can search the collection for objects in the shade of rebeccapurple.

In addition to the CSS3 and CSS4 color palettes we also added support for the Crayola color palette. For example, the closest color to “rebeccapurple” in the Crayola scheme of things is “cyber grape”.

You can see all the possible nearest-colors for an object by appending /colors to an object page URL. For example:

The dominant color for this object is #683e7e which maps to #58427c or “cyber grape” in Crayola-speak and #483d8b or “dark slate blue” in CSS3-speak and #663399 or “rebeccapurple” in CSS4-speak.

Now that we’ve done the work to support multiple palettes the only limits to adding more is time and imagination. I would like to add a greyscale palette. I would like to add one or more color-blind palettes. I would especially like to add a “blue” palette – one that spans non-photo blue through International Klein Blue all the way to Kind of Bloop midnight blue just to see where along that spectrum objects which aren’t even a little bit blue would fall.


The point being that there are any number of color palettes that we can devise and use as a lens through which to see our collection. Part of the reason we chose to include the Crayola color palette in version “2” of search-by-color is because the colors they’ve chosen have been given expressive names whose meaning is richer than the sum of their descriptive parts. What does it mean for an object’s colors to be described as macaroni and cheese-ish or outer space-ish in nature? Erika Hall’s 2007 talk Copy is Interface is an excellent discussion of this idea.

I spoke about some of these things last month at The Search is Over workshop, in London. I described the work we have done on the collections website, to date, as a kind of managing of absence. Specifically the absence of metadata and ways to compensate for its lack or incompleteness while still providing a meaningful catalog and resource.

It is through this work that we started to articulate the idea that: The value of the whole in aggregate, for all its flaws, outweighs the value of a perfect subset. The irregular nature of our collection metadata has also forced us to consider that even if there were a single unified interface to convey the complexities of our collection it is not a luxury we will enjoy any time soon.


Further, the efforts of more and more institutions (the Cooper Hewitt included) to embark on mass-digitization projects force an issue that we, as a sector, have been able to side-step until now: that no one, including lots of people who actually work at museums, has ever seen much of the work in our collections. So in relatively short order we will transition from a space defined by an absence of data to one defined by a surfeit of, at the very least, photographic evidence that no one will know how to navigate.

To be clear: This is a good problem to have but it does mean that we will need to start thinking about models to recognize the shape of the proverbial elephant in the room and to build tools to see it.

It is in those tools that another equally important challenge lies. The scale and the volume of the mass-digitization projects being undertaken means that out of necessity any kind of first-pass cataloging of that data will be done by machines. There simply isn’t the time (read: money) to allow things to be cataloged by human hands and so we will inevitably defer to the opinion of computer algorithms.

This is not necessarily as dour a prediction as it might sound. Color search is an example of this scenario and so far it’s worked out pretty well for us. What search-by-color and other algorithmic cataloging points to is the need to develop an iconography, or a colophon, to indicate machine bias. To design and create language and conventions that convey the properties of the “extruder” that a dataset has been shaped by.


Those conventions don’t really exist yet. Bracketing search by color with an identifiable palette (a bias) is one stab at the problem but there are so many more places where we will need to signal the meaning (the subtext?) of an automated decision. We’ve tried to address one facet of this problem with the different graphic elements we use to indicate the reasons why an object may not have an image.



Left to right: We’re supposed to have a picture for this object… but we can’t find it; This object has not been photographed; This object has been photographed but for some reason we’re not allowed to show it to you… you know, even though it’s been acquired by the Smithsonian.

Another obvious and (maybe?) easy place to try out this idea is search itself. Search engines are not, in fact, magic. Most search engines work the same way: A given string is “tokenized” and then each resultant piece is “filtered”. For example, the phrase “checkered Girard samples” might typically be tokenized by splitting things on whitespace but you could just as easily tokenize it by any pattern that can be expressed to a computer. So depending on your tokenizer you might end up with a list like:

  • checkered
  • Girard
  • samples


or a list like:

  • checkered Girard
  • samples

Each one of those “tokens” are then analyzed and filtered according to their properties. Maybe they get grouped by their phonetics, which is essentially how the snap-to-grid trick works for the collection’s color search. Maybe they are grouped by what type of word they are: proper nouns, verbs, prepositions and so on. I’ve never actually seen a search engine that does this but there is nothing technically to prevent someone from doing it either.
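As a toy example of that tokenize-then-filter pipeline (whitespace tokenization plus a couple of the filters listed further down, and nothing like a real Elasticsearch analyzer):

    # Tokenize on whitespace, then run each token through simple filters.
    STOPWORDS = {"the", "a", "of"}

    def tokenize(query):
        return query.split()          # split on whitespace

    def filter_tokens(tokens):
        out = []
        for token in tokens:
            token = token.lower()     # lowercase all tokens
            if token in STOPWORDS:    # ignore a set list of stopwords
                continue
            out.append(token)
        return out

    print(filter_tokens(tokenize("checkered Girard samples")))
    # -> ['checkered', 'girard', 'samples']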

The simplest and dumbest thing would be to indicate on a search results page that your query results were generated using one or more tokenizers or filters. In our case that would be (1) tokenizer and (5) filters.


The tokenizer:

    1. Unicode Standard Annex #29

The filters:

      1. Remove English possessives
      2. Lowercase all tokens
      3. Ignore a set list of stopwords
      4. Stem tokens according to the Porter Stemming Algorithm
      5. Convert non-ASCII characters to ASCII

That’s not very sexy or ooh-shiny but not everything needs to be. What it does, though, is provide a measure of transparency for people to gauge the reality that any result set is the product of choices which may have little or no relationship to the question being asked or the person asking that question.

These are devices, for sure, and they are not meant to replace a more considered understanding or contemplation of a topic but they can act as an important shorthand to indicate the arc of an answer’s motive.


And that’s just for search engines. Now imagine what happens when we all start pointing computer vision algorithms at our collections…

Update: Since publishing this blog post the nice people working on the GOV.UK websites launched “info” pages. Visitors can now append /info to any of the pages on the website and see what and who and how that part of the website is supposed to do. Writing about the project they say:

An ‘info’ page contains the user needs the page is intended to meet … Providing an easy way to jump from content to the underpinning needs allows content designers coming to a new topic to understand the need and build empathy with the users quicker. Publishing the GOV.UK user needs should also make the team’s work more transparent and traceable.