Category Archives: Collection data

Print The Exhibition – The Label Book Generator

As a Peter A. Krueger intern this summer, I am working in both the Digital and Emerging Media and Cross-Platform Publishing Departments at the Cooper Hewitt. Since I am traversing the two departments, a project that allows me to learn from each and create something that benefits both is of course ideal. The Label Book Generator does this in a twofold manner: It allows me the opportunity to learn and write code to develop a digital product, which in turn, serves to produce a physical publication of interpretive content for an exhibition.


Label Book Generator–’How Posters Work’ exhibition page

Currently a prototype, The Label Book Generator is a tool that creates a printed publication of object labels for each exhibition at the Cooper Hewitt. In its most basic use, Visitor Services at the museum can navigate to an exhibition from a list on the website’s homepage and once on an exhibition page, press Command-P (or File > Print) to generate a PDF with an initial cover page followed by a single label on each page–all entirely set in a larger font-size.

What initially prompted the development of this prototype was the need to solve readability issues visitors may have with existing wall labels. This does not imply that the current label design needs to change or be set in a larger font-size, but instead that the labeling system as a whole should be augmented with something to make the labels more accessible, to provide a magnifying glass of sorts when needed.


Publication in use in the gallery

The entire process proved to be invaluable as a learning experience. From the start it was obvious that I needed to leverage the museum’s API to access object data by exhibition to ultimately populate fields in each label. As the Label Book Generator website currently stands, the selection and order of the fields follow a predefined template that begins to apply the typographic guidelines of the existing wall labels. As a graphic designer it was particularly interesting for me to consider the meticulous planning that is usually involved in typesetting in parallel with the time spent writing the code, whereas typically these two processes are dealt with in succession.

Since the end result needed to be a book, I was set on formatting the data in a markdown document that would have typographic styles manually applied in InDesign. A Python script was written to create a markdown document with syntax assigned to each field, e.g., titles would be prepended with ‘#’ to be a top level header, dates with ‘##’ to be a second level, etc.
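The original script is not shown here, but the idea can be sketched roughly as follows. This is a minimal illustration only: the field names other than title and date, the field order, and the sample label are assumptions made for the example, not the museum’s actual template.

```python
# Only the '#' for titles and '##' for dates come from the post;
# every other field name here is a hypothetical placeholder.
HEADING_PREFIX = {"title": "#", "date": "##"}


def label_to_markdown(label, field_order=("title", "date", "medium", "description")):
    """Render one object label (a dict of field name -> text) as Markdown."""
    lines = []
    for field in field_order:
        value = label.get(field)
        if not value:
            continue  # a field missing from the record is simply skipped
        prefix = HEADING_PREFIX.get(field)
        # Fields with a known prefix become headers; the rest stay body text.
        lines.append(f"{prefix} {value}" if prefix else value)
    return "\n\n".join(lines) + "\n"


# Placeholder data, not a real collection record.
sample_label = {"title": "Example poster", "date": "1999", "description": "Example text."}
print(label_to_markdown(sample_label))
```

Each label becomes a short run of Markdown blocks, which is the kind of document that would then have had styles applied by hand in InDesign.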

Stumbling along with my rudimentary skills in Python–and at one point rewriting the whole thing in JavaScript, only to go back to Python–led me to conclude that outputting the final document with InDesign could be circumvented. With the much-appreciated help of Micah Walter, it was settled that rather than generating a markdown file, I should instead produce a small web application using Python and Flask as a framework. The most salient aspect of the entire project is now a simple print style sheet for the website that automatically generates the same final document that manual work in InDesign would have produced (the code is available on GitHub).

With a central concern for typography, the print style sheet seamlessly flows all the content into any fixed page format, which in this case would be a printable PDF. The printed document once bound can be considered an exhibition catalog reduced to its essential elements: A list of every work, with their respective information and descriptions (when available).


Interior spread of printed publication

The Label Book Generator solves the initial prompt of assisting those hard of seeing. However, considering that the website from the get-go is built with a responsive layout and scalable typography (again due to the simultaneity of graphic design and web development) there are a number of opportunities to expand its role and purpose.

Setting the typography, padding and margins in rems (root ems), rather than fixed sizes, makes it possible to control the base size and adjust every measurement relative to it. A future version of the website can include in the interface a means to control how large or small the base size of the document should be, given the dimensions of the fixed format–whether it be a standard letter-sized PDF, or otherwise.
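To make the idea of a controllable base size concrete, here is a rough sketch of how a print style sheet could be generated from a single base value. The function name, the `.label` class, and all the specific measurements are assumptions for illustration; the real style sheet lives in the project’s repository and may look quite different.

```python
def build_print_css(base_px=24, page_size="letter"):
    """Return a print style sheet whose label measurements scale with base_px.

    Everything inside .label is expressed in rem, so changing the root
    font size rescales type, padding and margins together.
    """
    return f"""@media print {{
  @page {{ size: {page_size}; margin: 0.75in; }}
  html {{ font-size: {base_px}px; }}  /* 1rem == base_px */
  .label {{
    padding: 1.5rem;
    page-break-after: always;  /* one label per printed page */
  }}
  .label h1 {{ font-size: 2rem; margin-bottom: 0.5rem; }}
  .label p {{ font-size: 1rem; line-height: 1.4; }}
}}
"""


# A larger base size yields a larger-print edition of the same labels.
print(build_print_css(base_px=30))
```

Because only the root font size changes, a future interface control would need to pass in just one number to produce a larger or smaller edition.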


Browser print dialog box


Cover page of printed publication generated from the website


Interior spread of printed publication

When presenting the prototype to others here at the museum, Katie Shelly brought up an interesting future use case involving blind visitors and screen reader software. In addition to the possibilities with printable versions of the Label Book Generator, the website itself provides a responsive mobile view of all the labels which could theoretically be read to visitors via their personal devices.


Mobile view of website

Finally, the printed label book serves as a means to visualize the collection database. If a label in the book and website is missing a field, it reflects an oversight at the ‘source of truth’. In other words, there is a one-to-one relationship between the fields in both the labels and the database. Ultimately, this brings to mind the commonplace workflow of producing wall labels that are manually written, designed, and edited (on this topic see also: Label Whisperer). Perhaps in a later version, a similar process of using the museum’s API to automate the generation of the label book could be applied to the entire production of wall labels for the museum.

Missing tags for the object on recto

Missing tags for ‘Amerika’

Give the Generator a go!

Object concordances – what is the simplest thing to match like with like?

eames-concordances-full

Do you notice anything special about this screenshot of Charles Eames’ famous No. 670 Chair?

It might be hard to see because it’s a tall screenshot and this is a small thumbnail. Have a look at the large version. Hint: It’s not the part where the chair is missing in the picture. It’s actually this, on the right-hand side of the object details:

eames-concordances-crop

Object concordances! With other museums! To the same objects in their collections!! On their own websites !!!

Before you get too excited (and think it’s actual working ‘Linked Data’), we should point out that as of this writing we have only “concordified” four distinct objects – this one, this one, this one and that one – eight times with four separate organizations, one of which is our own shop, so there is a lot of work left to do.


If you look carefully you can see that most of the concordances, to date, were added within about 90 minutes of one another. That’s because Seb and I were talking about object concordances over lunch that day and agreed that we could probably push the simplest and dumbest thing out the door before I went home. It has been something that has been on the agenda since mid-2012.

Specifically, we maintain a fixed list of institutions with whom we will “concordify” objects. If your institution isn’t on that list yet it’s not personal. We can add as many institutions as we want but we think the narrow focus helps to explain the purpose of the tool. Then we simply record that institution’s unique ID, the object ID for something in our collection and the object ID for something in their collection. That’s it.
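As a rough sketch of how little data that is, a concordance can be modelled as a three-field record. The class name, field names and all the IDs below are invented for illustration and do not reflect the actual database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concordance:
    institution_id: int   # ID from the fixed list of institutions
    our_object_id: int    # object ID in our collection
    their_object_id: str  # object ID in the other institution's collection


def concordances_for(object_id, records):
    """Return every concordance recorded for one of our objects."""
    return [r for r in records if r.our_object_id == object_id]


# Made-up IDs, purely for illustration.
records = [
    Concordance(institution_id=1, our_object_id=1001, their_object_id="A-42"),
    Concordance(institution_id=2, our_object_id=1001, their_object_id="1973.5"),
    Concordance(institution_id=1, our_object_id=2002, their_object_id="B-7"),
]
print(concordances_for(1001, records))
```

An object page would only need a lookup like `concordances_for` to list the matching objects held elsewhere.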


Currently the tools for adding concordances, or editing institutions, are … terrible.

(Or rather, they are the unadorned plumbing that makes the whole thing work. So they are beautiful and elegant in their own way but most people would be forgiven for not seeing those qualities right away.)

Short-term the goal is to build some friendlier “admin” web page for a few more people to add concordances without having to worry about the technical details. Medium-term the goal is to create restricted API methods for doing fancy-pants buttons and pop-up dialogs on the object pages themselves to allow staff to add concordances as they think of them or are otherwise just poking around the collections website. Maybe in the long term, ‘the crowd’ might be invited to do it too.


Somewhere between those two things we will also build proper “index” pages on the collections website of all the objects that have been concordified, all the institutions that have concordified objects and so on. Just like we’ve already done for people.

The other thing we’ll do shortly is make sure that these concordances are included in the CC0 Cooper Hewitt collections metadata dump which is available on GitHub.

When we said “the simplest thing” we meant it.

There isn’t much yet but it’s a start – a tangible proof of what it could be – and if we’ve done our job right then it is one of those things that will grow exponentially, as always, as time and circumstance permit.

(If you’ve been a long time reader you might remember we did Rijkscolors back in 2013 as an experiment in automatically matching objects – but we were undone by language and structural differences in metadata, and the reality that humans might still be better at this at least until the sector irons a few things out)

Collect all the things – shoeboxes, shop items and the Pen


You can now collect any object in the collection, or on display, from the collections website itself. Just like in the galleries there is a small “collect” icon on the top right-hand side of every object page on the collections website. It’s not just individual object pages but also all the object list pages, too. So many “collect” icons!

20150527-shoebox-visit-icon

  Objects that haven’t been collected yet have a grey icon.

  Objects that have been collected in the galleries, as part of a visit to the museum, have a pink icon.

  Objects that have been collected on the collections website have an orange icon.

Simply click the grey icon to collect an object or click one of the orange or pink icons to remove or un-collect that object.

That’s it!


Just like visit items, things you collect on the website have a permanent URL that can be made public to share with other people and can be given a bespoke title or description. Objects that you collect on the collections website live in something we’re calling the “shoebox”.

You can get to your shoebox by visiting https://collection.cooperhewitt.org/users/YOUR-USERNAME/shoebox or, if you’re already logged in to your Cooper Hewitt account, by visiting https://collection.cooperhewitt.org/you/shoebox/.

There is also a handy link in the Your stuff menu, located at the top-left of every page on the collections website.

The shoebox is the set of all the objects you’ve collected (or created) on the website or during your visits to the museum. Visits and visit items overlap with things in your shoebox, but we still treat them differently: you need to be logged in to your Cooper Hewitt account to add things to your shoebox, whereas a visit to the museum can be entirely anonymous if a visitor so chooses.

The default view for the shoebox is to display everything together in reverse-chronological order but you can filter the view to show only things collected online or things collected during a visit. You can also see the set of all the objects you’ve made public or private.
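The filtering described above could be sketched along these lines. This is only an illustration of the behaviour: the item structure, the `source` and `public` keys, and the sample data are assumptions, not how the collections website actually stores shoebox items.

```python
from datetime import datetime


def shoebox_view(items, source=None, public=None):
    """Return shoebox items newest-first, optionally filtered.

    source: None for everything, or "online" / "visit".
    public: None for everything, or True / False.
    """
    selected = [
        item for item in items
        if (source is None or item["source"] == source)
        and (public is None or item["public"] == public)
    ]
    # Default ordering is reverse-chronological.
    return sorted(selected, key=lambda item: item["collected_at"], reverse=True)


# Hypothetical shoebox contents.
SAMPLE_ITEMS = [
    {"id": 1, "source": "online", "public": True, "collected_at": datetime(2015, 5, 1)},
    {"id": 2, "source": "visit", "public": False, "collected_at": datetime(2015, 5, 20)},
    {"id": 3, "source": "online", "public": False, "collected_at": datetime(2015, 5, 10)},
]
print([item["id"] for item in shoebox_view(SAMPLE_ITEMS, source="online")])
```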

20150528-shoebox-loggedout

logged out view (large version)

2015-shoebox-loggedin

logged in view (large version)

But it’s not just objects, either. You can already collect videos during your museum visit so those are included too. Ultimately the only limit to what you might collect with the Pen is time-and-typing. Things we’re thinking about making collect-able include: entire exhibitions or the introductory texts on the wall for an exhibition or people or individual rooms in the Mansion.

Museum retail

We’ve started this process by allowing you to collect things in the museum Shop.

By “things in the Shop” we mean all the things that have ever been sold in the Shop over the years. And by “all the things” we mean almost all the things. There is some technical hoop-jumping related to inventory management systems and that is why we don’t have everything yet but we’ll get there in time.

We are a capital-D design museum with a capital-D design shop and many of the things that have been available in the Shop have gone on to become part of our permanent collection so it only makes sense to give them a home on the collections website. In fact MoMA already does something similar with their “find related products in the MoMA Store” feature though ours is a bit different.

20150527-shop-landing

You can see for yourself at https://collection.cooperhewitt.org/shop

The /shop section is divided in two parts: Brands and Items (and all the items for a given brand of course). There isn’t a whole lot of extra information beyond titles and links to the SHOP Cooper Hewitt website for those items that are currently in-stock but it’s a start. Like the rest of the collections website we’ve started with the idea that by providing permanent stable URLs that people can have confidence in, we create something that can be improved on over time.

20150527-shop-brands-crop

Shop items and brands don’t get updated as regularly as we’d like yet. We are still working through the fiddly details of bridging our systems with the Shop’s ecommerce and POS system and some things still need to be done by hand. We’ve been able to get this far though so we expect things will only get better.

20150527-shoebox-listview-shop

You might be wondering…

You might be reading this and starting to wonder: Hmmm… does that mean I can also collect things in the Shop as I walk around the museum with the Pen? The answer is… Yes!

As of this writing there are only one or two items that can be collected with the Pen because the Shop staff are still getting familiar with the tools and thinking about how making collect-able labels changes their day-to-day workflow. The obvious future of this might be the infamous ‘wedding register’, however we believe that many museum visitors actually would like to bookmark objects to possibly buy later, or just remember as part of their overall visit to the ‘museum campus’.

Practically what that has meant are some changes to Sam‘s “tag writer” application (the subject of a future blog post) to fetch shop items via our API and then letting the Shop folks decide what they want to tag and when they want to do it.

There has been a whole lot of change here over the course of the last three years and allowing the various parts of the museum to warm up to the possibilities that the Pen starts to afford at their own pace, and with not only a minimum of fuss but plenty of wiggle-room for experimentation, is really important.

In the meantime we hope that you enjoy collecting at least more, if not all, of the things that make up the museum.

Sorting, Synonyms and a Pretty Pony

We’ve been undergoing a massive rapid-capture digitization project here at the Cooper Hewitt, which means every day brings us pictures of things that probably haven’t been seen for a very, very long time.

As an initial way to view all these new images of objects, I added “date last photographed” to our search index and made it possible to sort by it on the search results page.

That’s when I found this.

[collection_object id=18692335]

I hope we can all agree that this pony is adorable and that if there is anything else like it in our collection, it needs to be seen right now. I started browsing around the other recently photographed objects and began to notice more animal figurines:

[collection_object id=18460201]

[collection_object id=18615463]

As serendipitous as it was that I came across this wonderful collection-within-a-collection by browsing through recently-photographed objects, what if someone is specifically looking for this group? The whole process shows off some of the work we did last summer switching our search backend over to Elasticsearch (which I recently presented at Museums and the Web). We wanted to make it easier to add new things so we could provide users (and ourselves) with as many “ways in” to the collection as possible, as it’s those entry points that allow for more emergent groupings to be uncovered. This is great for somebody who is casually spending time scrolling through pictures, but a user who wants to browse is different from a user who wants to search. Once we uncover a connected group of objects, what can we do to make it easier to find in the future?

Enter synonyms. Synonyms, as you might have guessed, are a text analysis technique we can use in our search engine to relate words together. In our case, I wanted to relate a bunch of animal names to the word “animal,” so that anyone searching for terms like “animals” or “animal figurines” would see all these great little friends. Like this bear.

[collection_object id=18633719]

The actual rule (generated with the help of Wikipedia’s list of animal names) is this:

 "animal => aardvark, albatross, alligator, alpaca, ant, anteater, antelope, ape, armadillo, baboon, badger, barracuda, bat, bear, beaver, bee, bird, bison, boar, butterfly, camel, capybara, caribou, cassowary, cat, kitten, caterpillar, calf, bull, cheetah, chicken, rooster, chimpanzee, chinchilla, chough, clam, cobra, cockroach, cod, cormorant, coyote, puppy, crab, crocodile, crow, curlew, deer, dinosaur, dog, puppy, salmon, dolphin, donkey, dotterel, dove, dragonfly, duck, poultry, dugong, dunlin, eagle, echidna, eel, elephant, seal, elk, emu, falcon, ferret, finch, fish, flamingo, fly, fox, frog, gaur, gazelle, gerbil, panda, giraffe, gnat, goat, sheep, goose, poultry, goldfish, gorilla, blackback, goshawk, grasshopper, grouse, guanaco, fowl, poultry, guinea, pig, gull, hamster, hare, hawk, goshawk, sparrowhawk, hedgehog, heron, herring, hippopotamus, hornet, swarm, horse, foal, filly, mare, pig, human, hummingbird, hyena, ibex, ibis, jackal, jaguar, jellyfish, planula, polyp, scyphozoa, kangaroo, kingfisher, koala, dragon, kookabura, kouprey, kudu, lapwing, lark, lemur, leopard, lion, llama, lobster, locust, loris, louse, lyrebird, magpie, mallard, manatee, mandrill, mantis, marten, meerkat, mink, mongoose, monkey, moose, venison, mouse, mosquito, mule, narwhal, newt, nightingale, octopus, okapi, opossum, oryx, ostrich, otter, owl, oyster, parrot, panda, partridge, peafowl, poultry, pelican, penguin, pheasant, pigeon, bear, pony, porcupine, porpoise, quail, quelea, quetzal, rabbit, raccoon, rat, raven, deer, panda, reindeer, rhinoceros, salamander, salmon, sandpiper, sardine, scorpion, lion, sea urchin, seahorse, shark, sheep, hoggett, shrew, skunk, snail, escargot, snake, sparrow, spider, spoonbill, squid, calamari, squirrel, starling, stingray, stinkbug, stork, swallow, swan, tapir, tarsier, termite, tiger, toad, trout, poultry, turtle, vulture, wallaby, walrus, wasp, buffalo, carabeef, weasel, whale, wildcat, wolf, wolverine, wombat, woodcock, woodpecker, 
worm, wren, yak, zebra"

Where every word to the right of the => automatically gets added to a search for a word to the left.
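To illustrate the behaviour described here, the following sketch parses a rule of that shape and expands a list of search terms. It models the effect as described in this post and is not Elasticsearch’s own implementation; the function names and the shortened rule are made up for the example.

```python
def parse_rule(rule):
    """Split a rule like 'animal => bear, pony' into ('animal', ['bear', 'pony'])."""
    left, right = rule.split("=>")
    return left.strip(), [word.strip() for word in right.split(",") if word.strip()]


def expand_query(terms, rules):
    """Add the right-hand words for every term that appears on a rule's left side."""
    expanded = []
    for term in terms:
        expanded.append(term)
        expanded.extend(rules.get(term, []))
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(expanded))


# A shortened version of the rule, for illustration only.
rules = dict([parse_rule("animal => bear, pony, bear")])
print(expand_query(["animal", "figurine"], rules))
```

Repeated words on the right-hand side do no harm in this sketch, since duplicates are dropped after expansion.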

Not only does our new search stack provide us with a useful way to discover emergent relationships, but it makes it easy for us to “seal them in,” allowing multiple types of user to get the most from our collections site.

How re-opening the museum enhanced our online collection: new views, new API methods

At the backend of our museum’s new interactive experiences lies our API, which is responsible for providing the frontend with all the data necessary to flesh out the experience. From everyday information like an object’s title to more novel features such as tags, videos and people relationships, the API gathers and organizes everything that you see on our digital tables before it gets displayed.

In order to meet the needs of the experiences designed for us by Local Projects on our interactive tables, we added a lot of new data to the API. Some of it was sitting there and we just had to go find it; other aspects we had to generate anew.

Either way, this marks a huge step towards a more complete and meaningful representation of our collection on the internet.

Today, we’re happy to announce that all of this newly-gathered data is live on our website and is also publicly available over the API (head to the API methods documentation to see more about that if you’re interested in playing with it programmatically).

People

For the Hewitt Sisters Collect exhibition, Local Projects designed a front-end experience for the multitouch tables that highlights the early donors to the museum’s collection and how they were connected to each other. Our in-house “TMS liaison”, Sara Rubinow, worked to gather and structure this information before adding it to TMS, our collection management system, as “constituent associations”. From there I extracted the structured data to add to our website.

We created the following new views on the web frontend to house this data:

We also added a few new biography-related fields: portraits or photographs of Hewitt Sisters people and two new biographies, one 75 words and the other 50 characters. These changes are viewable on applicable people pages (e.g. Eleanor Garnier Hewitt) and the search results page.

The overall effect of this is to make more use of this ‘people-related’ data, and to encourage the further expansion of it over time. We can already imagine a future where other interfaces examining and revealing the network of relationships behind the people in our collection are easily explored.

Object Locations and Things On Display

Some of the more difficult tasks in updating our backend to meet the new requirements were related to dealing with objects no longer being static – but moving on and off display. As far as the website was concerned, it was a luxury in our three years of renovation that objects weren’t moving around a whole lot because it meant we didn’t have to prioritize the writing of code to handle their movement.

But now that we are open we need to better distinguish those objects in storage from those that are on display. More importantly, if an object is on display, we also need to say which exhibition it is part of, and which room it is on display in.

Object locations have a lot of moving parts in TMS, and I won’t get into the specifics here. In brief, object movements from location to location are stored chronologically in a database. The “movement” is its own row that references where the object moved and why it moved there. By appropriately querying this history we can say what objects have ever been in the galleries (like all museums there is a large portion of objects that have never been part of an exhibition) and what objects are there right now.
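A simplified sketch of that kind of query, using plain Python instead of the real database, might look like the following. The row structure, the `is_gallery` flag and the location names are assumptions for illustration; TMS stores considerably more detail than this.

```python
from datetime import date


def current_location(movements, object_id):
    """Return the location named by the most recent movement row, if any."""
    rows = [m for m in movements if m["object_id"] == object_id]
    if not rows:
        return None
    return max(rows, key=lambda m: m["moved_on"])["location"]


def ever_on_display(movements):
    """Object IDs that have at least one movement into a gallery."""
    return {m["object_id"] for m in movements if m["is_gallery"]}


def on_display_now(movements):
    """Object IDs whose latest movement put them in a gallery."""
    latest = {}
    for m in sorted(movements, key=lambda m: m["moved_on"]):
        latest[m["object_id"]] = m  # later rows overwrite earlier ones
    return {oid for oid, m in latest.items() if m["is_gallery"]}


# Hypothetical movement history.
SAMPLE_MOVEMENTS = [
    {"object_id": 1, "moved_on": date(2014, 11, 1), "location": "Storage", "is_gallery": False},
    {"object_id": 1, "moved_on": date(2014, 12, 1), "location": "Gallery 1", "is_gallery": True},
    {"object_id": 2, "moved_on": date(2014, 12, 1), "location": "Gallery 2", "is_gallery": True},
    {"object_id": 2, "moved_on": date(2015, 3, 1), "location": "Storage", "is_gallery": False},
    {"object_id": 3, "moved_on": date(2014, 10, 1), "location": "Storage", "is_gallery": False},
]
print(ever_on_display(SAMPLE_MOVEMENTS), on_display_now(SAMPLE_MOVEMENTS))
```

The distinction between the two sets is the point: object 2 has been on display at some time, but only object 1 is in a gallery according to its latest row.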

We created the following views to house this information:

Exhibitions

The additions we’ve made to exhibitions are:

There is still some work to be done with exhibitions. This includes figuring out a way to handle object rotations (the process of swapping out some objects mid-exhibition) and outgoing loans (the process of lending objects to other institutions for their exhibitions). We’re expecting that objects on loan should say where they are, and which external exhibition they are part of – creating a valuable public ‘trail’ of where an object has traveled over its life.

Tags

Over the summer, we began an ongoing effort to ‘tag’ all the objects that would appear on the multitouch tables. This includes everything on display, plus about 3,000 objects related to those. The express purpose for tags was to provide a simple, curated browsing experience on the interactive tables – loosely based around themes ‘user’ and ‘motif’. Importantly these are not unstructured, and Sara Rubinow did a great job normalizing them where possible, but there haven’t been enough exhibitions, yet, to release a public thesaurus of tags.

We also added tags to the physical object labels to help visitors draw their own connections between our objects as they scan the exhibitions.

On the website, we’ve added tags in a few places:

That’s it for now – happy exploring! I’ll follow up with more new features once we’re able to make the associated data public.

Until then, our complete list of API methods is available here.

A colophon for bias

The term [colophon] derives from tablet inscriptions appended by a scribe to the end of a … text such as a chapter, book, manuscript, or record. In the ancient Near East, scribes typically recorded information on clay tablets. The colophon usually contained facts relative to the text such as associated person(s) (e.g., the scribe, owner, or commissioner of the tablet), literary contents (e.g., a title, “catch” phrase, number of lines), and occasion or purpose of writing.

Wikipedia

A couple of months ago we added the ability to search the collections website by color using more than one palette. A brief refresher: Our search by color functionality works by first extracting the dominant palette for an image. That means the top 5 colors out of a possible 32 million choices. 32 million is too large a surface area to search against so each of the five results is then “snapped” to its closest match on a much smaller grid of possible colors. These matches are then indexed and used to query our database when someone searches for objects matching a specific color.

It turns out that the CSS3 color palette, which defines a fixed set of 138 colors, is an excellent choice for doing this sort of thing. CSS stands for Cascading Style Sheets, a “language used to describe the presentation” of a webpage separate from its content. Instead of asking people searching the collections website to be hyper-specific in their queries we take the color they are searching for and look for the nearest match in the CSS palette.

For example: #ef0403 becomes #ff0000 or “red”. #f2e463 becomes #f0e68c or “khaki” and so on.

This approach allows us to not only return matches for a specific color but also to show objects that are more like a color than not. It’s a nice way to demonstrate the breadth of the collection and also an invitation to pair objects that might never be seen together.
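The “snapping” step can be sketched as a nearest-neighbour lookup. The version below assumes plain Euclidean distance in RGB space and uses only a handful of CSS3 named colors; the actual distance measure and palette handling on the collections website are not described in this post and may well differ.

```python
def hex_to_rgb(value):
    """Convert '#rrggbb' into an (r, g, b) tuple of integers."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def snap(color, palette):
    """Return the (name, hex) palette entry closest to color.

    Closeness here is squared Euclidean distance in RGB, which is an
    assumption made for this sketch.
    """
    r, g, b = hex_to_rgb(color)

    def distance(entry):
        pr, pg, pb = hex_to_rgb(entry[1])
        return (r - pr) ** 2 + (g - pg) ** 2 + (b - pb) ** 2

    return min(palette.items(), key=distance)


# A small subset of the CSS3 named colors, enough for the examples above.
CSS3_SUBSET = {
    "red": "#ff0000",
    "khaki": "#f0e68c",
    "yellow": "#ffff00",
    "darkslateblue": "#483d8b",
    "indigo": "#4b0082",
    "dimgray": "#696969",
}

print(snap("#ef0403", CSS3_SUBSET))
print(snap("#f2e463", CSS3_SUBSET))
```

With this subset, the two example colors above snap to “red” and “khaki” respectively. Swapping in a different dictionary is all it takes to snap against another palette.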

search-is-over.020-640

From the beginning we’ve planned to support multiple color palettes. Since the initial search-by-color functionality was built in a hurry, with a focus on seeing whether we could get it to work at all, adding support for multiple palettes was always going to require some re-jiggering of the original code. That, of course, means that finding the time to make those changes had to compete with the crush of everything else, and on most days it got left behind.

Earlier this year Rebecca Alison Meyer, the 6-year-old daughter of Eric Meyer, a long-standing member of the CSS community, died of cancer. Eric’s contributions and work to promote the CSS standard cannot be overstated. The web would be an entirely other (an entirely poorer) space without his efforts and so some people suggested that a 139th color be added to the CSS Color module to recognize his work and honor his daughter. In June Dominique Hazaël-Massieux wrote:

I’m not sure about how one goes adding names to CSS colors, and what the specific purpose they fulfill, but I think it would be a good recognition of @meyerweb ‘s impact on CSS, and a way to recognize that standardization is first and foremost a social process, to name #663399 color “Becca Purple”.

In reply Eric Meyer wrote:

I have been made aware of the proposal to add the named color beccapurple (equivalent to #663399) to the CSS specification, and also of the debate that surrounds it.

I understand the arguments both for and against the proposal, but obviously I am too close to both the subject and the situation to be able to judge for myself. Accordingly, I let the editors of the Colors specification know that I will accept whatever the Working Group decides on this issue, pro or con. The WG is debating the matter now.

I did set one condition: that if the proposal is accepted, the official name be rebeccapurple. A couple of weeks before she died, Rebecca informed us that she was about to be a big girl of six years old, and Becca was a baby name. Once she turned six, she wanted everyone (not just me) to call her Rebecca, not Becca.

She made it to six. For almost twelve hours, she was six. So Rebecca it is and must be.

Shortly after that, #663399 or rebeccapurple was added to the CSS4 Colors module specification, at which point it only seemed right to finally add support for multiple color palettes to the collections website.

20140818-rebeccapurple-sm

Over the course of a month or so, in the margins of the day, all of the search-by-color code was rewritten to work with more than a single palette and now you can search the collection for objects in the shade of rebeccapurple.

In addition to the CSS3 and CSS4 color palettes we also added support for the Crayola color palette. For example, the closest color to “rebeccapurple” in the Crayola scheme of things is “cyber grape”.

You can see all the possible nearest-colors for an object by appending /colors to an object page URL. For example:

https://collection.cooperhewitt.org/objects/18380795/colors

The dominant color for this object is #683e7e which maps to #58427c or “cyber grape” in Crayola-speak and #483d8b or “dark slate blue” in CSS3-speak and #663399 or “rebeccapurple” in CSS4-speak.

Now that we’ve done the work to support multiple palettes the only limits to adding more are time and imagination. I would like to add a greyscale palette. I would like to add one or more color-blind palettes. I would especially like to add a “blue” palette – one that spans non-photo blue through International Klein Blue all the way to Kind of Bloop midnight blue just to see where along that spectrum objects which aren’t even a little bit blue would fall.


The point being that there are any number of color palettes that we can devise and use as a lens through which to see our collection. Part of the reason we chose to include the Crayola color palette in version “2” of search-by-color is because the colors they’ve chosen have been given expressive names whose meaning is richer than the sum of their descriptive parts. What does it mean for an object’s colors to be described as macaroni and cheese-ish or outer space-ish in nature? Erika Hall’s 2007 talk Copy is Interface is an excellent discussion of this idea.

I spoke about some of these things last month at The Search is Over workshop, in London. I described the work we have done on the collections website, to date, as a kind of managing of absence. Specifically the absence of metadata and ways to compensate for its lack or incompleteness while still providing a meaningful catalog and resource.

It is through this work that we started to articulate the idea that: The value of the whole in aggregate, for all its flaws, outweighs the value of a perfect subset. The irregular nature of our collection metadata has also forced us to consider that even if there were a single unified interface to convey the complexities of our collection it is not a luxury we will enjoy any time soon.

search-is-over.023-640

Further, the efforts of more and more institutions (the Cooper Hewitt included) to embark on mass-digitization projects force an issue that we, as a sector, have been able to side-step until now: That no one, including lots of people who actually work at museums, has ever seen much of the work in our collections. So in relatively short order we will transition from a space defined by an absence of data to one defined by a surfeit of, at the very least, photographic evidence that no one will know how to navigate.

To be clear: This is a good problem to have but it does mean that we will need to start thinking about models to recognize the shape of the proverbial elephant in the room and building tools to see it.

It is in those tools that another equally important challenge lies. The scale and the volume of the mass-digitization projects being undertaken means that out of necessity any kind of first-pass cataloging of that data will be done by machines. There simply isn’t the time (read: money) to allow things to be cataloged by human hands and so we will inevitably defer to the opinion of computer algorithms.

This is not necessarily as dour a prediction as it might sound. Color search is an example of this scenario and so far it’s worked out pretty well for us. What search-by-color and other algorithmic cataloging point to is the need to develop an iconography, or a colophon, to indicate machine bias. To design and create language and conventions that convey the properties of the “extruder” that a dataset has been shaped by.

Those conventions don’t really exist yet. Bracketing search by color with an identifiable palette (a bias) is one stab at the problem but there are so many more places where we will need to signal the meaning (the subtext?) of an automated decision. We’ve tried to address one facet of this problem with the different graphic elements we use to indicate the reasons why an object may not have an image.

missing-nnot-available-n

no-photography-n

Left to right: We’re supposed to have a picture for this object… but we can’t find it; This object has not been photographed; This object has been photographed but for some reason we’re not allowed to show it to you… you know, even though it’s been acquired by the Smithsonian.

Another obvious and (maybe?) easy place to try out this idea is search itself. Search engines are not, in fact, magic. Most search engines work the same way: A given string is “tokenized” and then each resultant piece is “filtered”. For example, the phrase “checkered Girard samples” might typically be tokenized by splitting things on whitespace but you could just as easily tokenize it by any pattern that can be expressed to a computer. So depending on your tokenizer you might end up with a list like:

  • checkered
  • Girard
  • samples

Or:

  • checkered Girard
  • samples
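
To make the two lists above concrete, here is a minimal sketch in Python. It is not how our search engine does it; the function names and the regular expression are made up for illustration, and the pattern simply keeps a word together with a capitalised word that follows it.

```python
import re


def tokenize_whitespace(query):
    """Split a query on runs of whitespace."""
    return query.split()


def tokenize_pattern(query, pattern):
    """Return every non-overlapping match of `pattern` as a token."""
    return [match.group(0) for match in re.finditer(pattern, query)]


# Hypothetical pattern: a word followed by a capitalised word stays
# together as one token; any other word is a token on its own.
WORD_PLUS_CAPITALISED = r"\w+ [A-Z]\w+|\w+"

query = "checkered Girard samples"
print(tokenize_whitespace(query))
print(tokenize_pattern(query, WORD_PLUS_CAPITALISED))
```

The first call produces three tokens and the second produces two, which is the whole point: the same string yields different raw material for the filters depending on a choice the person searching never sees.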

Each one of those “tokens” is then analyzed and filtered according to its properties. Maybe they get grouped by their phonetics, which is essentially how the snap-to-grid trick works for the collection’s color search. Maybe they are grouped by what type of word they are: proper nouns, verbs, prepositions and so on. I’ve never actually seen a search engine that does this but there is nothing technically to prevent someone from doing it either.

The simplest and dumbest thing would be to indicate on a search results page that your query results were generated using one or more tokenizers or filters. In our case that would be (1) tokenizer and (5) filters.

Tokenizers:

    1. Unicode Standard Annex #29

Filters:

      1. Remove English possessives
      2. Lowercase all tokens
      3. Ignore a set list of stopwords
      4. Stem tokens according to the Porter Stemming Algorithm
      5. Convert non-ascii characters to ascii
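
As a rough illustration of what "showing your work" could look like, here is a small Python sketch of a filter chain that returns the list of filters alongside the tokens. Everything in it is an assumption made for the example: the stopword list is tiny and invented, a plain whitespace split stands in for the Unicode Standard Annex #29 tokenizer, and the Porter stemming step is left out entirely rather than faked.

```python
import unicodedata

# Hypothetical stopword list, far shorter than a real one.
STOPWORDS = {"a", "an", "and", "of", "the"}


def strip_possessive(token):
    """Remove a trailing English possessive ('s or ’s)."""
    for suffix in ("'s", "’s"):
        if token.endswith(suffix):
            return token[: -len(suffix)]
    return token


def lowercase(token):
    return token.lower()


def drop_stopword(token):
    """Return None for stopwords so they are removed from the list."""
    return None if token in STOPWORDS else token


def fold_to_ascii(token):
    """Decompose accented characters and discard the non-ASCII parts."""
    decomposed = unicodedata.normalize("NFKD", token)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Labels mirror the list above; stemming is deliberately omitted.
FILTERS = [
    ("Remove English possessives", strip_possessive),
    ("Lowercase all tokens", lowercase),
    ("Ignore a set list of stopwords", drop_stopword),
    ("Convert non-ascii characters to ascii", fold_to_ascii),
]


def analyze(query):
    """Run the filter chain and report which steps shaped the result."""
    tokens = query.split()  # stand-in tokenizer, not UAX #29
    for _label, apply_filter in FILTERS:
        filtered = (apply_filter(token) for token in tokens)
        tokens = [token for token in filtered if token]
    return {
        "tokens": tokens,
        "tokenizer": "whitespace split (stand-in)",
        "filters": [label for label, _ in FILTERS],
    }


report = analyze("The Girard’s café samples")
print(report["tokens"])
print(report["filters"])
```

A results page could print the `filters` list more or less verbatim under the search box, which is about as simple and dumb as this idea gets.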

That’s not very sexy or ooh-shiny but not everything needs to be. What it does, though, is provide a measure of transparency for people to gauge the reality that any result set is the product of choices which may have little or no relationship to the question being asked or the person asking that question.

These are devices, for sure, and they are not meant to replace a more considered understanding or contemplation of a topic but they can act as an important shorthand to indicate the arc of an answer’s motive.

And that’s just for search engines. Now imagine what happens when we all start pointing computer vision algorithms at our collections…


Update: Since publishing this blog post the nice people working on the GOV.UK websites launched “info” pages. Visitors can now append /info to any of the pages on the gov.uk website and see what that part of the website is supposed to do, for whom, and how. Writing about the project they say:

An ‘info’ page contains the user needs the page is intended to meet … Providing an easy way to jump from content to the underpinning needs allows content designers coming to a new topic to understand the need and build empathy with the users quicker. Publishing the GOV.UK user needs should also make the team’s work more transparent and traceable.

Bravo!

Video Capture for Collection Objects

Stepping inside a museum storage facility is a cool experience. Your usual gallery ambience (dramatic lighting, luxurious swaths of empty space, tidy labels that confidently explain all) is completely reversed. Fluorescent lights are overhead, keycode entry pads protect every door, and official ID badges are worn by every person you see. It’s like a hospital, but instead of patients there are 17th century nightgowns and Art Deco candelabras. Nestled into tiny, sterile beds of acid-free tissue paper and archival linen, the patients are occasionally woken and gently wheeled around for a state-of-the-art microscope scan, an elaborate chemical test, or a loving set of sutures.

A gloved, cardigan-ed museum worker pushing a rolling cart down a hallway of large white shelving units.

A rare peek inside the storage facility.

If you ask a staff member for an explanation of this or that object on the nearest cart or shelf, they might tell you a detailed story, or they might say that so far, not much is known. I like the element of unevenness in our knowledge; it’s very different from the uniform level of confidence one sees in a typical exhibition.

The web makes it possible to open this space to the public in all its unpolished glory – and many other museums have made significant inroads into new audiences by pulling back the curtain. The prospect is like catnip for the intellectually curious, but hemlock for most museum employees.

Typically, the only media that escape this secretive storage facility are hi-res TIFFs artfully shot in an on-site photography studio. The seamless white backdrop and perfectly staged lighting, while beautiful and ideal for documentation, completely belie the working lab environment in which they were made.

We just launched a new video project called “Collections in Motion.” The idea is super simple: short videos that demonstrate collections objects that move, flip, click, fold, or have any moveable part.

Here are some of the underlying thoughts framing the project:

  • Still images don’t suffice for some objects. Many of them have moving parts, make sounds, have a sense of weight, etc that can’t be conveyed through images.
  • Our museum’s most popular videos on YouTube are all kinetic, kinda entrancing, moving objects. (Contour Craft 3D Printing, A Folding Bicycle, and a Pop-up Book, for example).
  • Videos played in the gallery generally don’t have sound or speakers available.
  • In research interviews with various types of visitors, many people said that they wouldn’t be interested in watching a long, involved video in a museum context.
  • Animated GIFs, 6-second Vines, and 15-second Instagram videos loom large in our contemporary visual/communication culture.
  • How might we think of the media we produce (videos, images, etc) as a part of an iterative process that we can learn from over time? Can we get comfortable with a lower quality but higher number of videos going out to the public, and seeing what sticks (through likes, comments, viewcount, etc)?

 

A screenshot from YouTube Analytics showing most popular videos: Contour Crafting, Folding Bicycle, Puss in Boots Pop-up book, et cetera

Our most popular YouTube videos for this quarter. They are all somewhat mesmerizing/cabinet-of-curiosity type things.

Here are some of the constraints on the project:

  • No budget (pairs nicely with the preceding bullet).
  • Moving collections objects is a conservation no-no. Every human touch, vibration and rub is bad for the long-long-longevity of the object (not to mention the peace of mind of our conservators).
  • Conservators’ and curators’ time is in HIGH demand, especially as we get closer to our re-opening. They are busy writing new books, crafting wall labels, preparing gallery displays, etc. Finding a few hours to pull an object from storage and move it around on camera is a big challenge.

So, nerd world, what do you think?

Dataclimber explores colors in the Cooper Hewitt collection

Rubén Abad’s #museumselfie outside of a museum

A few weeks ago we became aware of Rubén Abad’s poster which shows all the colours in our collection by decade. We sent a few questions over to Spain to find out more . . .

Q: What were some of the precursors to the color poster? What inspired you?

A: The idea came when I first saw Lev Manovich’s ‘Software Takes Command‘ book cover. When I started looking at the data, another couple of paintings came to my mind. For example, Salvador Dalí’s series about visual perception and ‘pixels’, as in Homage to Rothko (The Dalí Museum). By chance, I attended an exhibition here in Madrid where I discovered ‘Study for Index: Map of the World‘, by Art & Language (MACBA). By the time I came back home, it was clear that I wanted to display color evolution over time using a mosaic.

Q: Did you have any expectation about what the final product would look like? Did the end result surprise you?

A: I didn’t have any preconceived notion. I liked to see how groups of pieces appeared.

Q: What were the challenges of working with the dataset? What were the holes, problems? How could we make it better/easier to work with?

A: Being used to working with data made it really easy for me to work with the collection’s dataset, so thanks for releasing it! The only complaint I might have is having to parse some fields, like medium, so I could store the information in a format that is more comfortable to query.

Q: What would you like to do next?

A: I have a network of people and objects in mind, in order to display who has the biggest ‘influence’ in the collection.

Q: If other museums made their data available like this, what might you do with it?

A: I’d like to work on a history of the object project. If we were able to access all the important dates and places in an object’s history, we could try to cross-reference all the objects’ info and maybe, you never know, find new hubs where pieces happened to be at the same time and why they were there. Another interesting project would be to find gender inequality among collections, not only when looking at artists/designers, but also with donors and funders and even among representations (iconography). Have these roles changed over the years? Are they different depending on the country?

Dataclimber’s color poster.

Welcome to object phone. Your call has been placed in a queue.

I made another small thing. Again, another way for me to experiment with the Collection API, and again, another way to experiment with new ways of accessing the collection. This time, there aren’t many screen shots to display–there is no website to look at. This time, it’s “Welcome to object phone!”

(718) 213-4915

“Object Phone” is (presently) a very, very simple implementation of a way to explore our collection by dialing a telephone, or sending a text message. I had been thinking of a few of the more popular museum oriented audio tour products, and how they all seem to be very CMS style in their design, and wondering if we could just use our own API.

For example, TourML and TAP (which offer the web programmer a very powerful framework for programming a mobile guide using the Drupal CMS) are very nice, but they are still very dependent on content production. The developer or content manager has to build and curate all of the content for the “tour.” This might be a good way to go about things, especially if you are leaning on an existing Drupal installation for a good deal of your content, but I was looking for a way to access existing data, and specifically the data in our collection website.

In the beginning of developing our collection website, we went through the process of assigning EVERYTHING a unique “bigint” in the form of what we are referring to as an “artisanal integer.” This means that each object record, each person record and each, well, everything else has a unique integer which no other thing can have. This is not in place of accession numbers–we will probably always have accession numbers. The nice thing about unique integers is that they’re really easy to deal with on a programmatic level.

For example, if you text 18704235 to 718-213-4915 you should get a response that looks like the screenshot below. In fact you can text any object id number from our collection and get a similar response.

2013-04-18 10.15.18

You can also dial that same number and use your keypad to either search the collection by object ID, or ask for a random object. The application will respond to you using a text to speech converter, which is usually pretty good.

Presently, the app is not replying with a whole lot of information. You essentially get the object’s title and medium field if it has one. In many cases, asking for a random object may just result in something like “Drawing.” Many of our object records don’t have much more useful information than this, and also, I am trying to wrestle with the idea of how much information is useful in a voice and text message (with a 160-character limit per SMS).
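
To give a sense of what fitting a record into a single text message involves, here is a small Python sketch. It is not the code running behind the phone number; the function name and its parameters are invented for this example, and it does nothing but build a string from a title and an optional medium, trimming it to the 160-character limit.

```python
SMS_LIMIT = 160  # characters available in a single SMS


def format_sms_reply(title, medium=None):
    """Build a reply from a title and optional medium, trimmed to fit.

    `title` and `medium` are hypothetical parameter names standing in
    for whatever fields an object record actually provides.
    """
    parts = [title]
    if medium:
        parts.append(medium)
    body = ". ".join(parts)
    if len(body) <= SMS_LIMIT:
        return body
    # Leave room for the ellipsis so the result never exceeds the limit.
    return body[: SMS_LIMIT - 3].rstrip() + "..."


print(format_sms_reply("Drawing"))
print(format_sms_reply("Drawing", "Graphite on paper"))
```

Trimming with an ellipsis is only one option; a reply could just as well be split across several messages, and which is better probably depends on how much people want to read on a phone.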

The whole system is leveraging the Twilio service and API. Twilio offers quite a range of possibilities, and I am very excited to experiment with more. For example, instead of text to speech, Twilio can play back .wav files. Additionally, Twilio can do things like dial another phone number, forward calls and record the caller’s voice. There are so many possibilities here that I won’t even begin to list them, but for example, I could easily see us using this to capture user feedback in our galleries by phone and text.

I’m very interested in figuring out a way to search by voice. I’m sort of dreaming of programming the thing to go “Why don’t you just tell me the object number!” as in this great episode of Seinfeld which you can watch by clicking the image below.

If you are interested, I have also made the code public on this Gist. It’s pretty messy and redundant right now, but you’ll get the idea.

One of the more complicated aspects of this project will be designing the phone interface so it makes sense. Currently, once you hear an object play back, the system just hangs up on you. It would be nice to offer the user a better way to manipulate the system which is still pleasant and easy to understand. By that same token, there is a completely different approach that is needed for the SMS end of things as you don’t really have a menu tree, but instead a list of possible commands the user needs to learn. Fortunately, there is a ton of great work that has already been accomplished in this arena, specifically by the Walker Art Center’s very long running and very yellow website Art on Call.

Source code at github.com/cooperhewitt/objectphone

"cmd-P"

I made us a print stylesheet for object pages on the collections website. (What does that mean? It means you can print out the webpage and it will look nice).

Printout of Object #18621871.. before stylesheet.

Printout of Object #18621871 after stylesheet. Much better. Office carpet courtesy of Tandus flooring.

This should be very useful for us in-house, especially curators and education.. and anyone doing exhibition planning.. (which right now is many of us).

It’s not very fancy or anything. Basically I just stripped away all the extraneous information and got right to the essential details, kind of like designing for mobile.

six printouts on standard paper from the collections website, taped in two rows to an iMac screen.

cascading style sheet is cascading.

In a moment of caffeinated Friday goofiness, Aaron printed out a bunch of weird objects he found (e.g. iPad described for aliens as “rectangular tablet computer with rounded corners”) and Scotch taped them all over Seb’s computer screen as a nice decorative touch for his return the next morning.

What we realized in looking at all the printouts, though, is that the simplified view of a collection record resembles a gallery wall label. And we’re currently knee-deep in the wall label discussion here at the Museum as we re-design the galleries (what does it need? what doesn’t it need? what can it do? how can it delight? how can it inform?).

I don’t yet have any conclusions to draw from that observation.. other than it’s a good frame to talk about our content and its presentation.

..to be continued!