Category Archives: Experimental

Content sharing and ambient display with Electric Objects EO1

Long live RSS

I just made a new Tumblr. It’s called “Recently Digitized Design.” It took me all of five minutes. I hope this blog post will take me all of ten.

But it’s actually kinda cool, and here’s why. Cooper Hewitt is in the midst of a mass digitization project in which we will have digitized our entire collection of over 215K objects by mid to late next year. Wow! 215K objects. That’s impressive, especially when you consider that probably 5000 of those are buttons!

What’s more is that we now have a pretty decent “pipeline” up and running. This means that as objects are being digitized and added to our collections management system, they are automatically winding up on our collections website after winding their way through a pretty hefty series of processing tasks.

Over on the West Coast, Aaron felt the need to make a little RSS feed for these “recently digitized” objects so we could all easily watch the new things come in. RSS, which stands for “Rich Site Summary”, has been around forever, and many have said that it is now a dead technology.

Lately I’ve been really interested in the idea of microservices. I guess I never really thought of it this way, but an RSS or ATOM feed is kind of a microservice. Here’s a highlight from “Building Microservices” by Sam Newman that explains this idea in more detail.

Another approach is to try to use HTTP as a way of propagating events. ATOM is a REST-compliant specification that defines semantics ( among other things ) for publishing feeds of resources. Many client libraries exist that allow us to create and consume these feeds. So our customer service could just publish an event to such a feed when our customer service changes. Our consumers just poll the feed, looking for changes.

Taking this a bit further, I’ve been reading this blog post, which explains how one might turn around and publish RSS feeds through an existing API. It’s an interesting concept, and I can see us making use of it for something just like Recently Digitized Design. It sort of brings us back to the question of how we publish our content on the web in general.

In the case of Recently Digitized Design the RSS feed is our little microservice that any client can poll. We then use IFTTT as the client, and Tumblr as the output where we are publishing the new data every day.
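The polling idea can be sketched in a few lines of Python. This is only an illustration, not the code behind Recently Digitized Design: the sample feed, the example.org links, and the `new_items` helper are all made up, and a real client such as IFTTT would fetch the feed over HTTP on a schedule rather than read it from a string.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 2.0 document standing in for the "recently digitized" feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Recently digitized (sample)</title>
    <item><title>Button</title><link>https://example.org/objects/1</link><guid>1</guid></item>
    <item><title>Poster</title><link>https://example.org/objects/2</link><guid>2</guid></item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return items whose guid has not been seen yet, and record them as seen."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append({"title": item.findtext("title"), "link": item.findtext("link")})
    return fresh


seen = set()
first_poll = new_items(SAMPLE_FEED, seen)   # both items are new on the first poll
second_poll = new_items(SAMPLE_FEED, seen)  # nothing has changed, so nothing is returned
```

The only state the client needs is the set of identifiers it has already seen, which is a big part of why a feed is so easy to build on.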

RSS certainly lives up to its nickname (Really Simple Syndication), offering a really simple way to serve up new data, and that to me makes it a useful thing for making quick and dirty prototypes like this one. It’s not a streaming API or a fancy push notification service, but it gets the job done, and if you log in to your Tumblr Dashboard, please feel free to follow it. You’ll be presented with 10-20 newly photographed objects from our collection each day.

UPDATE:

Today in the office . . . “Twitterbot or it doesn’t exist” (especially for @micahwalter)

— Seb Chan (@sebchan) July 10, 2015

So this happened: https://twitter.com/recentlydigital

Things people make with our API #347: Nick Bartzokas


Shortly after Cooper Hewitt opened on December 12, 2014, the museum hosted a private event. At the preliminary scoping for the event, I bumped into Nick Bartzokas who had written a spiffy little application that he was planning on using for visuals on the night. We got talking and it turned out that he’d made it using the Cooper Hewitt API – all with no prompting. Even though it didn’t end up getting fully used, he has released it along with the source code.

Tell me a bit about yourself, what do you do, where do you do it?

I’m a creative coder. I like trying out new things. That’s led me to develop a wide variety of projects: educational games, music visualizations, a Kinect flight simulator, an interactive API-fed wall of Arduinos and Raspberry Pis. These days I’m making interactive installations for the LAB at Rockwell Group. I came to the LAB from the American Museum of Natural History, so museums are in my blood, too.

The LAB is a unique place. We’re a team of designers, thinkers, and technologists exploring ways to connect the digital with the physical.

Here’s a couple links to our work: (1 / 2)

You made a web app for an event at Cooper Hewitt, what was the purpose of it, what does it do?

Our friends at Metropolis celebrated their magazine’s redesign at the Cooper Hewitt in December 2014. The LAB worked on a one-night-only interactive installation that ran on one of the museum’s 84″ touchtables. We love to experiment, so when opportunities like this come up, we jump at the chance to pick up a new tool and create.

In preparation for the event, I decided to prototype using Phaser, a 2D JavaScript game framework. It markets itself as a tool for making web platformers, but it’s excellent for 2D projects of all kinds.

It gives you an update and render cycle that’s familiar territory for those that work with other game engines or creative coding toolkits like openFrameworks. It handles user input and asset management well. It has three physics engines of ranging sophistication, from simple Arcade collisions to full-body physics. You can choreograph sprites using built-in tweening. It has PIXI integrated under the hood, which supplies fast graphics with useful shaders and the ability to roll your own. So, lots of range. It’s a great tool for rapid browser-based prototyping.

The prototype we completed for the event brought Metropolis magazine’s digital assets to life. Photos drifted like leaves on a pond. When touched, they attracted photos of similar objects, assembling into flower petals and fans. If held, they grew excited until bursting apart. It ran in a fullscreened browser and was responsive to over 40 simultaneous touch points. Here’s that version in action.

For the other prototype, I used Cooper Hewitt’s API to generate fireworks made of images from the museum’s collection. Since the collection is organized by color, I could ask the API for all the red images in the collection and turn them into a red firework burst.

I thought this project was really cool, so while it wasn’t selected for the Metropolis event, I decided to complete it anyway and post it.

OMG! You used the Cooper Hewitt API! How did you find out about the API? What was it like to work with the API? What was the best and the worst thing about the API?

When the LAB begins a project, we start by considering the story. We were celebrating the Metropolis magazine redesign. Of course that was the main focus. But their launch party was being held at the Cooper Hewitt, and they wrote about Caroline Baumann of the Cooper Hewitt in their launch issue, so the museum was a part of the story. We began gathering source material from Metropolis and Cooper Hewitt. It was then that I re-discovered the Cooper Hewitt API. It was something I’d heard about in the buzz leading up to the museum’s reopening, but this was my first time encountering it in the wild.

You all did a great job! Working with the API was so straightforward. Everything was well designed. The API website is simple and useful. The documentation is clear and complete with the ability to testdrive API methods in the browser. The structure of the API is sensible and intuitive. I taught a class on API programming for beginners. It was a challenge to select APIs with a low barrier to entry that beginners would be excited about and capable of navigating. Cooper Hewitt’s API is on my list now. I think beginners would find it quick, easy, and rewarding.

The pyramid diagram on the home page was a nice touch, a modest infographic with a big story behind it. It gives the newcomer a bird’s-eye view of the API, the new gallery apps, the redesigned museum, all the culmination of a tremendous collaboration.

The ability to search the collection by color immediately jumped out to me. That feature is just rife with creative possibilities. My favorite part, no doubt. In fact, I think it’s worth expanding on the API’s knowledge of color. It knows an image contains blue, but perhaps it could have some sense of how much blue the image contains, perhaps a color average or a histogram.

In preparing a nodejs app to pull images for the fireworks, I checked to see if someone had written a node module for the Cooper Hewitt API, expecting I’d have to write my own. I was pleasantly surprised to see that the museum’s own Micah Walter authored one. That was another wow moment. When an institution opens up an API, that’s good. But this is really where Cooper Hewitt is building a bridge to the development community. It’s the little things.

So if others want to play with what you made where can they find it?

Folks can interact with the prototype here and they can peek at the source code on GitHub.

Thanks for having me, and congratulations on the API, the museum’s reopening, and a job well done!

Rethinking Search on the Collections Site


One of my longer-term projects since joining the museum has been rethinking how the search feature functions on the collections website. As we get closer to re-opening the museum with a suite of new technologies, our work in collaboration with Local Projects has prompted us to take a close look at the moving pieces that comprise the backend of our collections site and API. Search, naturally, forms a large piece of that. Last week, after a few weeks of research and experimentation, I pushed the first iteration live. In this post, I’ll share some of the thoughts and challenges that informed our changes.

First, a glossary of terms for readers who (like me, a month ago) have little-to-no experience with the inner-workings of a search engine:

Platform: The software that actually does the searching. The general process is that we feed data to the platform (see “index”), and then we ask it for results matching a certain set of parameters (see “query”). Everything else is handled by the platform itself. Part of what I’ll get into below involves our migration from one platform, Apache Solr, to another, Elasticsearch.
Index: An index is the database that the search platform uses to perform searches on. The search index is a lot like the primary database (it probably could fill that role if it had to) but it adds extra functionality to facilitate quick and accurate retrieval of search results.
Query: The rules to follow in selecting things that are appropriate to provide as search results. For users, the query could be something like “red concert poster,” but we have to translate that into something that the search provider will understand before results can be retrieved. Search providers give us a lot of different ways we can query things (ranges of a number, geographic distance or word matching to name a few), and a challenge for us as interface designers is to decide how transparent we want to make that translation. Queries also allow us to define how results should be sorted and how to facet results.
Faceting/Aggregation: A way of grouping results based on traits they possess. For example, faceting on “location” when you search our collection for “cat” reveals that 80 of our cat-related things are from the USA, 16 are from France, and so on.
Analysis (Tokenization/Stemming etc): A process that helps a computer work with sentences. Tokenization, for example, would split a search for “white porcelain vase” into the individual tokens: “white,” “porcelain” and “vase,” and then perform a search for any number of those tokens. Another example is stemming, which would allow the platform to understand that if a user searches for “running,” then items containing other words like “run” or “runner” are also valid search results. Analysis also gives us the opportunity to define custom rules that might include “marathon” and “track” as valid results in a search for “running.”
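To make the analysis step in the glossary concrete, here is a toy sketch in Python. It is not how Solr or Elasticsearch implement analysis: the suffix list is deliberately naive and made up for this example, and real analyzers use proper algorithms (Porter stemming, for instance) plus language-specific rules.

```python
import re

# Deliberately naive suffix list, invented for illustration only.
SUFFIXES = ("ning", "ner", "ing", "s")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


tokens = tokenize("White porcelain vase")
stems = {stem(word) for word in ("run", "runner", "running")}
```

With these rules “run,” “runner” and “running” all reduce to the same stem, which is what lets a search for one of them match documents containing the others.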

The State of Search

Our old search functionality showed its symptoms of under-performance in a few ways. For example, basic searches — phrases like “red concert poster” — turned up no results despite the presence of such objects in our collection, and searching for people would not return the person themselves, only their objects. These symptoms led me to identify what I considered the two big flaws in our search implementation.

On the backend, we were only indexing objects. This meant that if you searched for “Ray Eames,” you would see all of the objects we have associated with her, but to get to her individual person page, you would have to first click on an object and then click on her name. Considering that we have a lot of non-objects¹, it makes sense to index them all and include them, where relevant, in the results. This made my first objective to find a way to facilitate the indexing and querying of different types of things.

On the frontend, we previously gave users two different ways to search our collection. The default method, accessible through the header of every page, performed a full text search on our Solr index and returned results sorted by image complexity. Users could also choose the “fancy search” option, which allowed for searches on one or more of the individual fields we index, like “medium,” “title,” or “decade.” We all agreed here that “fancy search” was confusing, and all of its extra functionality — faceting, searching across many fields — shouldn’t be seen as “advanced” features. My second objective in rethinking how search works, then, was to unify “fancy” and “regular” search into just “search.”

Objective 1: Update the Backend

Our search provider, Solr, requires that a schema be present for every type of thing being indexed. The schema (an XML file) tells Solr what kind of value to expect for a certain field and what sort of analysis to perform on the field. This means I’d have to write a schema file — anticipating how I’d like to form all the indexed data — for each new type of thing we want to search on.

One of the features of Elasticsearch is that it is “schemaless,” meaning I can throw whatever kind of data I want at the index and it figures out how to treat it. This doesn’t mean Elasticsearch is always correct in its guesses — for example, it started treating our accession numbers as dates, which made them impossible to search on — so it also gives you the ability to define mappings, which have the same effect as Solr’s schema. But if I want to add “people” to the index, or add a new “location” field to an object, using Elasticsearch means I don’t have to fiddle with any schemas. This trait of Elasticsearch alone made it worth the switch (see Larry Wall’s first great virtue of programmers, laziness: “the quality that makes you go to great effort to reduce overall energy expenditure”) because it’s important to us that we have the ability to make quick changes to any part of our website.
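As a rough illustration of what such a mapping might look like, the sketch below builds one as a Python dictionary. The field name `accession_number` and the index name in the comment are assumptions, not taken from our actual configuration, and the syntax follows recent Elasticsearch releases; older versions expressed the same idea differently.

```python
import json

# Hypothetical field name; the real name of the accession number field may differ.
ACCESSION_FIELD = "accession_number"

mapping = {
    "mappings": {
        # Stop Elasticsearch from guessing that date-like strings are dates.
        "date_detection": False,
        "properties": {
            # "keyword" stores the value verbatim so it can be matched exactly.
            ACCESSION_FIELD: {"type": "keyword"},
        },
    }
}

# This JSON body would be sent when creating the index (for example, PUT /objects).
mapping_body = json.dumps(mapping, indent=2)
```

Everything not named in the mapping is still handled dynamically, so adding a new field later does not require touching it.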

Before building anything in to our web framework, I spent a few days getting familiar with Elasticsearch on my own computer. I wrote a Python script that loops through all of the CSVs from our public collections repository and indexes them in a local Elasticsearch server. From there, I started writing queries just to see what was possible. I was quickly able to come up with a lot of the functionality we already have on our site (full-text search, date range search) and get started with some complex queries as well (“most common medium in objects between 1990-2000,” for example, which is “paper”). This code is up on GitHub, so you can get started with your own Cooper Hewitt search engine at home!
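The general shape of that experiment can be sketched as follows. This is not the script on GitHub: the sample CSV, its column names (`id`, `title`, `medium`, `year_start`) and the `objects` index name are invented for illustration. The code only builds the request bodies (a payload in the format Elasticsearch’s bulk API expects, and a range query with a terms aggregation); it does not talk to a server.

```python
import csv
import io
import json

# Hypothetical rows standing in for the public collection CSVs.
SAMPLE_CSV = """id,title,medium,year_start
1,Concert poster,paper,1995
2,Side chair,wood,1950
"""


def bulk_payload(csv_text, index_name):
    """Turn CSV rows into newline-delimited JSON for the bulk API."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["year_start"] = int(row["year_start"])  # index years as numbers
        lines.append(json.dumps({"index": {"_index": index_name, "_id": row["id"]}}))
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def most_common_medium_query(start_year, end_year):
    """Build a query asking for the single most common medium in a year range."""
    return {
        "size": 0,  # only the aggregation is wanted, not the matching documents
        "query": {"range": {"year_start": {"gte": start_year, "lte": end_year}}},
        "aggs": {"media": {"terms": {"field": "medium", "size": 1}}},
    }


payload = bulk_payload(SAMPLE_CSV, "objects")
query = most_common_medium_query(1990, 2000)
```

The payload would be posted to the `_bulk` endpoint and the query to `_search`; the answer to “most common medium” then comes back in the aggregation buckets rather than in the hits.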

Once I felt that I had a handle on how to index and query Elasticsearch, I got started building it into our site. I created a modified version of our Solr indexing script (in PHP) that copied objects, people, roles and media from MySQL and added them to Elasticsearch. Then I got started on the endpoint, which would take search parameters from a user and generate the appropriate query. The code for this would change a great deal as I worked on the frontend and occasionally refactored and abstracted pieces of functionality, but all the pieces of the pipeline were complete and I could begin rethinking the frontend.

Objective 2: Update the Frontend

Updating the frontend involved a few changes. Since we were now indexing multiple categories of things, there was still a case for keeping a per-category search view that gave users access to each field we have indexed. To accommodate these views, I added a tab bar across the top of the search forms, which defaults to the full-collection search. This also eliminates confusion as to what “fancy search” did as the search categories are now clearly labeled.

Showing the tabbed view for search options

The next challenge was how to display sorting. Previously, the drop-down menu containing sort options was hidden in a “filter these results” collapsible menu. I wanted to lay out all of the sorting options for the user to see at a glance and easily switch between sorting modes. Instead of placing them across the top in a container that would push the search results further down the page, I moved them to a sidebar which would also house search result facets (more on that soon). While it does cut in to our ability to display the pictures as big as we’d like, it’s the only way we can avoid hiding information from the user. Placing these options in a collapsible menu creates two problems: if the menu is collapsed by default, we’re basically ensuring that nobody will ever use them. If the menu is expanded by default, then it means that the actual results are no longer the most important thing on the page (which, on a search results page, they clearly are). The sidebar gives us room to lay out a lot of options in an unobtrusive but easily-accessible way².

Switching between sort mode and sort order.

The final challenge on the frontend was how to handle faceting. Faceting is a great way for users who know what they’re looking for to narrow down options, and a great way for users who don’t know what they’re looking for to be exposed to the various buckets we’re able to place objects in to.

Previously on our frontend, faceting was only available on fancy search. We displayed a few of the faceted fields across the top of the results page, and if users wanted further control, they could select individual fields to facet on using a drop-down menu at the bottom of the fancy search form. When they used this, though, the results page displayed only the facets, not the objects. In my updates, I’ve turned faceting on for all searches. The facets appear alongside the search results in the sidebar.
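Conceptually, the facets in the sidebar are just counts of how many results fall into each bucket for a given field. The sketch below shows that idea in plain Python; the sample results and field names are invented, and on the real site the counting is done by the search platform, not by application code like this.

```python
from collections import Counter

# Hypothetical search results for the query "cat".
RESULTS = [
    {"title": "Cat figurine", "country": "USA", "medium": "porcelain"},
    {"title": "Cat poster", "country": "France", "medium": "paper"},
    {"title": "Cat textile", "country": "USA", "medium": "silk"},
]


def facet(results, field):
    """Count results per value of `field`, most common value first."""
    return Counter(r[field] for r in results if r.get(field)).most_common()


country_facets = facet(RESULTS, "country")
```

Each (value, count) pair becomes one clickable line in the sidebar, and clicking it re-runs the search with that value as an extra filter.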

Relocating facets from across the top of the page to the sidebar

Doing it Live

We initially rolled these changes out about 10 days ago, though they were hidden from users who didn’t know the URL. This was to prove to ourselves that we could run Elasticsearch and Solr alongside each other without the whole site blowing up. We’re still using Solr for a bit more than just the search (for example, to show which people have worked with a given person), so until we migrate completely to Elasticsearch, we need to have both running in parallel.

A few days later, I flipped the switch to make Elasticsearch the default search provider and passed the link around internally to get some feedback from the rest of the museum. The feedback I got was important not just for working out the initial bugs and kinks, but also (and especially for myself as a relative newbie to the museum world) to help me get the language right and consider all the different expectations users might have when searching our collection. This resulted in some tweaks to the layout and copy, and some added functionality, but mostly it will inform my bigger-picture design decisions going forward.

A Few Numbers…

Improving performance wasn’t a primary objective in our changes to search, but we got some speed boosts nonetheless.

Query	Before (Solr)	After (Elasticsearch)
query=cat, facets on	162 results in 1240-1350ms	167 results in 450-500ms
year_acquired=gt1990, facets on	13,850 results in 1430-1560ms	14,369 results in 870-880ms
department_id=35347493&period_id=35417101, facets on	1,094 results in 1530-1580ms	1,150 results in 960-990ms

There are also cases where queries that turned up nothing before now produce relevant results, like “red concert poster,” (0 -> 11 results) “German drawings” (0 -> 101 results) and “checkered Girard samples” (0 -> 10 results).

Next Steps

Getting the improved search in front of users is the top priority now – that means you! We’re very interested in hearing about any issues, suggestions or general feedback that you might have — leave them in the comments or tweet us @cooperhewittlab.

I’m also excited about integrating some more exciting search features — things like type-ahead search and related search suggestion — on to the site in the future. Additionally, figuring out how to let users make super-specific queries (like the aforementioned “most common medium in objects between 1990-2000”) is a challenge that will require a lot of experimentation and testing, but it’s definitely an ability we want to put in the hands of our users in the future.

New Search is live on our site right now – go check it out!

¹ We’ve been struggling to find a word to use for things that are “first-class” in our collection (objects, people, countries, media etc.) that makes sense to both museum-folk and the laypeople. We can’t use “objects” because those already refer to a thing that might go on display in the museum. We’ve also tried “items,” “types” and “isas” (as in, “what is this? it is a person”). But nothing seems to fit the bill.

² We’re not in complete agreement here at the labs over the use of a sidebar to solve this design problem, but we’re going to leave it on for a while and see how it fares with time. Feedback is requested!

The Medium is the Message (and pubsocketd)

Robot Rothko


Now that I’ve written this blog post it occurs to me that it would be trivial to build something similar on top of the Cooper Hewitt Collections API — since that’s ultimately where all this colour stuff comes from — so I will probably do that shortly and stick it in the Play section.

That’s something I wrote last week on my personal weblog. I was writing about a little web “application” that I’d made to generate algorithmic “multiforms” that recall the work of the late painter Mark Rothko. The colors used to create these robot-multiforms are derived from photo uploads and extracted using the same code that the Cooper Hewitt uses to generate color palettes for the objects in our collection. We wrote about that process last year.

These robot “paintings” are built by fetching three photos and using their primary color to fill one of three stacked rectangles that make up the canvas. A dominant color for a fourth photo is used along with an inset CSS3 box-shadow to give the illusion of a fuzzy, hazy background on which the rectangles sit. Every 60 seconds a new version is generated and the colors (and boxes) gently transition from old to new.
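Here is a rough sketch of that construction in Python, generating inline CSS for one multiform. It is not the application’s actual code (which runs in the browser): the helper names, the minimum rectangle height and the shadow values are assumptions made up for illustration, and the four colours are borrowed from the sample API response shown later in this post.

```python
import random


def random_heights(rng, parts=3, minimum=15):
    """Split 100 percent into `parts` integer heights, each at least `minimum`."""
    spare = 100 - parts * minimum
    cuts = sorted(rng.randint(0, spare) for _ in range(parts - 1))
    bounds = [0] + cuts + [spare]
    return [minimum + bounds[i + 1] - bounds[i] for i in range(parts)]


def multiform_css(colours, background, heights):
    """Build inline CSS for the canvas and its three stacked rectangles."""
    canvas = f"background-color: {background};"
    rectangles = [
        # Hypothetical styling: an inset shadow in the background colour softens the edges.
        f"height: {h}%; background-color: {c}; box-shadow: inset 0 0 40px 20px {background};"
        for c, h in zip(colours, heights)
    ]
    return canvas, rectangles


heights = random_heights(random.Random(2014))
canvas, rectangles = multiform_css(["#b8ab5b", "#c7c7c7", "#db8952"], "#c7a9af", heights)
```

Regenerating the heights and colours on a timer, and letting CSS transitions animate the change, would give the slow fade from one “painting” to the next.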

In that original blog post, I also wrote:

That’s it. It doesn’t do anything else and that’s part of the charm for me. It just sits in the background running in second-screen-mode stamping out robot-Rothko paintings. … It’s nice to have a new screen friend to spend the days with

They’re not really Rothko paintings, obviously, and to suggest that they are would do the painter a disservice. Rothko’s paintings are not just any random set of colors stacked on top of one another. Rothko worked long and hard to choose the arrangement of his paintings and it’s easy to imagine that he would have been horrified by some of the combinations that Robot Rothko offers up. But like the experimental Albers Boxes feature they are a nod and gesture – and a wink – towards the real thing.

Having gotten things working for a personal non-museum and not-really-for-strangers project I decided that it would be nice to do something similar for the museum which is absolutely for everyone. So, today we are launching Robot Rothko which is exactly the same as the application described above except that it uses objects from our collection instead of photos as its source material. Like this:

https://collection.cooperhewitt.org/play/robot-rothko/#info

See the #info part of that URL? That will cause the application to load with an information box that explains what you’re looking at (and that will close itself automatically after 30 seconds). If you just want to jump straight to the application all you have to do is remove the #info from the URL.

https://collection.cooperhewitt.org/play/robot-rothko/

Robot Rothko will automatically update itself using random object records to create a new multiform every 60 seconds. Mouse over any color to see the object it represents. Click on the text to see our collection record for the object itself.

You can also filter stuff by person or decade. You can also filter by the year we acquired an object if you can guess where it is; that one still feels a little buggy so we’re going to hold off publishing the URL until we can figure out what’s wrong. Here are some examples of the first two:

https://collection.cooperhewitt.org/play/robot-rothko/people/18046041

https://collection.cooperhewitt.org/play/robot-rothko/decade/1910

Robot Rothko is native to the web which means it will work in any modern web browser whether it’s on your desktop or your phone or your tablet. It can be put into fullscreen mode (by pressing shift-F) and if you save the website’s URL to your homescreen on your phone, or tablet, it is configured to launch without any of the usual browser chrome. If you use a Mac you can plug the URL for Robot Rothko in to Todd Ditchendorf’s handy Fluid.app which will turn it all in to a shiny desktop application. I am guessing there are equivalent tools for Windows or Linux but I don’t know what they are.


If you’d like to generate your own Robot Rothkos there’s an API method for doing just that:

https://collection.cooperhewitt.org/api/methods/cooperhewitt.play.robotRothko

And of course it works with our recently announced support for DSON as a response format:

curl -X GET 'https://api.collection.cooperhewitt.org/rest/?method=cooperhewitt.play.robotRothko&access_token=SEEKRET&person_id=18041501&format=dson'

such "rothko" is such "canvas" is so "49" and "28" and "23" many and "palette" is so such "colour" is "#b8ab5b" , "id" is "18805769" , "epitaph" is "Folding Fan, 1900\u201305. Medium: silk, wood, horn, metal, metal spangles. Gift of Lillian C. Hart. 1985-89-1." wow ? such "colour" is "#c7c7c7" . "id" is "18640557" ! "epitaph" is "Drawing, \"Two Studies for Rectangul\", ca. 1965. Pen and black ink on white wove paper. Gift of Vladimir Kagan. 1992-56-7." wow , such "colour" is "#db8952" , "id" is "18133219" , "epitaph" is "Fragment, mid-18th century. Medium: silk\nTechnique: plain weave patterned by supplementary warp floats and complementary weft floats. Gift of John Pierpont Morgan. 1902-1-811." wow many ? "background" is such "colour" is "#c7a9af" . "id" is "18761047" ! "epitaph" is "Booklet Cover Sheet, 1916. Color woodcut on lavender wove paper paper. Museum purchase from Drawings and Prints Council Fund and through gift of Margery and Edgar Masinter and Merrill C. Berman. 1999-50-1-3." wow wow , "filters" is so many and "stat" is "ok" wow
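If you would rather build the request in code than with curl, the same URL can be assembled in Python. This is just a sketch: the `robot_rothko_url` helper is made up for this example, the endpoint, method name and parameters are copied from the curl command above, and you would need to substitute a real access token for SEEKRET.

```python
from urllib.parse import urlencode

# Endpoint and method name as they appear in the curl example above.
API_ENDPOINT = "https://api.collection.cooperhewitt.org/rest/"
METHOD = "cooperhewitt.play.robotRothko"


def robot_rothko_url(access_token, response_format="dson", **filters):
    """Build the request URL; `filters` are optional parameters such as person_id."""
    params = {"method": METHOD, "access_token": access_token, "format": response_format}
    params.update(filters)
    return API_ENDPOINT + "?" + urlencode(params)


url = robot_rothko_url("SEEKRET", person_id="18041501")
```

Passing the URL to any HTTP client should then return the same kind of response as the curl call.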

Robot Rothko lives in a new section of the collections website called “Play“. The distinction between the Play section and the Experimental Features section of the website can probably be easiest thought of as: Experimental features are things that apply to the entirety of the collections website, while Play things are small contained applications that use the collections API and focus on or build off a particular aspect of the collection. The first of these was Sam Brenner’s SkyDesigner and Robot Rothko is actually the third such application.

In between those two was What Would Micah Say? (WWMS) a quick end-of-day project to test out the W3C’s Text-to-Speech APIs that are starting to appear in some web browsers (read: Chrome and Safari as of this writing, and make sure you have the volume turned up). The WWMS “application” was mostly a simple 20-minute exercise to test whether fetching some content dynamically and feeding it to the text-to-speech APIs actually works and produces something useable. It does, which is very exciting because it opens up any number of accessibility-related improvements we can start thinking about adding to the collections website.

That we happened to use the cooperhewitt.labs.whatWouldMicahSay API method and then configured the text-to-speech API to read his words as if spoken by a “French” robot made it all a little bit silly and a little more fun but those are important considerations. Because sometimes playing at – or making interesting – a technical problem is the best way to work through whether it is even worth pursuing in the first place.

Announcing SkyDesigner! Sam Brenner joins the Labs


Greetings readers! My name is Sam and I’m the new Interactive Media Developer here at the Cooper-Hewitt’s Digital and Emerging Media department. I’m thrilled to be here with the opportunity to help design and build the future of the museum, both online and in-house.

As part of my application for the position, I built SkyDesigner, a web application that lets users replace the color of the sky with a picture of a similarly-colored object from the Cooper-Hewitt’s collection. The “sky” idea comes from the original assignment, which was to create an application using both a weather API and the Cooper-Hewitt API, but you can use SkyDesigner to swap out colors from anything you can take a picture of (meaning, it’s great for selfies). Give it a try now!


Here’s how it works: first, users take a picture. If they’re on a computer, they can use their webcam. If they’re on a smartphone, they can use the built-in camera. Android users get (in my opinion) the better experience, because Android supports getUserMedia – this means that users can start their camera and take a picture without ever having to leave the application. iOS doesn’t support getUserMedia yet, so they are sent off to the native iOS camera app to take their picture, which then gets passed back to the browser. Once I receive the picture, I load it into a canvas.

In the next step, users tap on their picture to select a color. The color’s hex code is sent straight to the Cooper-Hewitt API’s search method, where I search for similarly-colored objects that have an associated image. While waiting for a response from the API, I also tell the canvas to make every pixel within range of the selected color become transparent. When I get the image back from the API, I load it in behind the canvas and presto! It shows through where the selected color used to be. Finally, the image is titled based on the object’s creator and your current weather information.
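The transparency step can be sketched outside the browser too. The version below works on a plain list of RGBA tuples rather than a canvas, and it assumes that “within range” means a Euclidean distance in RGB space below some tolerance; the actual SkyDesigner code is JavaScript and may measure closeness differently. The helper names and the sample pixels are made up.

```python
def hex_to_rgb(hex_code):
    """Convert a hex code such as '#87ceeb' to an (r, g, b) tuple."""
    hex_code = hex_code.lstrip("#")
    return tuple(int(hex_code[i:i + 2], 16) for i in (0, 2, 4))


def knock_out(pixels, target, tolerance):
    """Set alpha to 0 for every RGBA pixel within `tolerance` of `target`."""
    tr, tg, tb = target
    result = []
    for r, g, b, a in pixels:
        # Assumed metric: straight-line distance between the two colours in RGB space.
        distance = ((r - tr) ** 2 + (g - tg) ** 2 + (b - tb) ** 2) ** 0.5
        result.append((r, g, b, 0) if distance <= tolerance else (r, g, b, a))
    return result


sky = hex_to_rgb("#87ceeb")
pixels = [(135, 206, 235, 255), (140, 200, 230, 255), (20, 80, 20, 255)]
masked = knock_out(pixels, sky, tolerance=30)
```

Once the matching pixels are transparent, whatever image sits behind the canvas shows through in their place.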

It’s built using HTML, CSS and JavaScript. The original application used PHP to talk to the API, but that has been ported to JavaScript now that I have the luxury of running the site on the Collections website itself, where we have our own built-in API hooks.

Being a weekend project, there are some missing features – sharing is a big one – but I think it demonstrates the API’s ability to provide fresh, novel ways into a museum’s vast collection. Here’s the link again, and you can also find the source on GitHub.

Video Capture for Collection Objects


Stepping inside a museum storage facility is a cool experience. Your usual gallery ambience (dramatic lighting, luxurious swaths of empty space, tidy labels that confidently explain all) is completely reversed. Fluorescent lights are overhead, keycode entry pads protect every door, and official ID badges are worn by every person you see. It’s like a hospital, but instead of patients there are 17th century nightgowns and Art Deco candelabras. Nestled into tiny, sterile beds of acid-free tissue paper and archival linen, the patients are occasionally woken and gently wheeled around for a state-of-the-art microscope scan, an elaborate chemical test, or a loving set of sutures.

A gloved, cardigan-ed museum worker pushing a rolling cart down a hallway of large white shelving units.

A rare peek inside the storage facility.

If you ask a staff member for an explanation of this or that object on the nearest cart or shelf, they might tell you a detailed story, or they might say that so far, not much is known. I like the element of unevenness in our knowledge; it’s very different from the uniform level of confidence one sees in a typical exhibition.

The web makes it possible to open this space to the public in all its unpolished glory – and many other museums have made significant inroads into new audiences by pulling back the curtain. The prospect is like catnip for the intellectually curious, but hemlock for most museum employees.

Typically, the only media that escape this secretive storage facility are hi-res TIFFs artfully shot in an on-site photography studio. The seamless white backdrop and perfectly staged lighting, while beautiful and ideal for documentation, completely belie the working lab environment in which they were made.

We just launched a new video project called “Collections in Motion.” The idea is super simple: short videos that demonstrate collections objects that move, flip, click, fold, or have any moveable part.

Here are some of the underlying thoughts framing the project:

Still images don’t suffice for some objects. Many of them have moving parts, make sounds, have a sense of weight, etc. that can’t be conveyed through images.
Our museum’s most popular videos on YouTube are all kinetic, kinda entrancing, moving objects. (Contour Craft 3D Printing, A Folding Bicycle, and a Pop-up Book, for example).
Videos played in the gallery generally don’t have sound or speakers available.
In research interviews with various types of visitors, many people said that they wouldn’t be interested in watching a long, involved video in a museum context.
Animated GIFs, 6-second Vines, and 15-second Instagram videos loom large in our contemporary visual/communication culture.
How might we think of the media we produce (videos, images, etc) as a part of an iterative process that we can learn from over time? Can we get comfortable with a lower quality but higher number of videos going out to the public, and seeing what sticks (through likes, comments, viewcount, etc)?

A screenshot from YouTube Analytics showing most popular videos: Contour Crafting, Folding Bicycle, Puss in Boots Pop-up book, et cetera

Our most popular YouTube videos for this quarter. They are all somewhat mesmerizing/cabinet-of-curiosity type things.

Here are some of the constraints on the project:

No budget (pairs nicely with the preceding bullet).
Moving collections objects is a conservation no-no. Every human touch, vibration and rub is bad for the long-long-longevity of the object (and not to mention the peace of mind of our conservators).
Conservators’ and curators’ time is in HIGH demand, especially as we get closer to our re-opening. They are busy writing new books, crafting wall labels, preparing gallery displays, etc. Finding a few hours to pull an object from storage and move it around on camera is a big challenge.

So, nerd world, what do you think?

Label Whisperer

Have you ever noticed the way people in museums always take pictures of object labels? On many levels it is the very definition of an exercise in futility. Despite all the good intentions I’m not sure how many people ever look at those photos again. They’re often blurry or shot on an angle and even when you can make out the information there aren’t a lot of avenues for that data to get back in to the museum when you’re not physically in the building. If anything I bet that data gets slowly and painfully typed in to a search engine and then… who knows what happens.

As of this writing the Cooper-Hewitt’s luxury and burden is that we are closed for renovations. We don’t even have labels for people to take pictures of, right now. As we think through what a museum label should do it’s worth remembering that cameras, and in particular cameras on phones, and the software for doing optical character recognition (OCR) have reached a kind of maturity where they are both fast and cheap and simple. They have, in effect, shown up at the party so it seems a bit rude not to introduce ourselves.

I mentioned that we’re still working on the design of our new labels. This means I’m not going to show them to you. It also means that it would be difficult to show you any of the work that follows in this blog post without tangible examples. So, the first thing we did was to add a could-play-a-wall-label-on-TV endpoint to each object on the collection website. Which is just fancy-talk for “another web page”.

Simply append /label to any object page and we’ll display a rough-and-ready version of what a label might look like and the kind of information it might contain. For example:

https://collection.cooperhewitt.org/objects/18680219/label/

Now that every object on the collection website has a virtual label we can write a simple print stylesheet that allows us to produce a physical prototype which mimics the look and feel and size (once I figure out what’s wrong with my CSS) of a finished label in the real world.

So far, so good. We have a system in place where we can work quickly to change the design of a “label” and test those changes on a large corpus of sample data (the collection) and a way to generate an analog representation since that’s what a wall label is.

Careful readers will note that some of these sample labels contain colour information for the object. These are just placeholders for now. As much as I would like to launch with this information it probably won’t make the cut for the re-opening.

Do you remember when I mentioned OCR software at the beginning of this blog post? OCR software has been around for years and its quality and cost and ease-of-use have run the gamut. One of those OCR applications is Tesseract, which began life in the labs at Hewlett-Packard and has since found a home and an open source license at Google.

Tesseract is mostly a big bag of functions and libraries but it comes with a command-line application that you can use to pass it an image whose text you want to extract.

In our example below we also pass an argument called label. That’s the name of the file that Tesseract will write its output to. It will also add a .txt extension to the output file because… computers? These little details are worth suffering because when fed the image above this is what Tesseract produces:

$> tesseract label-napkin.jpg label
Tesseract Open Source OCR Engine v3.02.01 with Leptonica
$> cat label.txt
______________j________
Design for Textile: Napkins for La Fonda del
Sol Restaurant

Drawing, United States ca. 1959

________________________________________
Office of Herman Miller Furniture Company

Designed by Alexander Hayden Girard

Brush and watercolor on blueprint grid on white wove paper

______________._.._...___.___._______________________
chocolate, chocolate, sandy brown, tan

____________________..___.___________________________
Gift of Alexander H. Girard, 1969-165-327

I think this is exciting. I think this is exciting because Tesseract does a better than good enough job of parsing and extracting text that I can use that output to look for accession numbers. All the other elements in a wall label are sufficiently ambiguous or unstructured (not to mention potentially garbled by Tesseract’s robot eyes) that it’s not worth our time to try and derive any meaning from.

Conveniently, accession numbers are so unlike any other element on a wall label as to be almost instantly recognizable. If we can piggy-back on Tesseract to do the hard work of converting pixels in to words then it’s pretty easy to write custom code to look at that text and extract things that look like accession numbers. And the thing about an accession number is that it’s the identifier for the thing a person is looking at in the museum.
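To make that concrete, here is a minimal Python sketch of the "look for things that look like accession numbers" step. It is not the code from the released project. The pattern is an assumption based only on the one example in this post (a four-digit year followed by two numeric groups), and the names are hypothetical.

```python
import re

# Assumed shape: four-digit year, then two numeric groups, e.g. 1969-165-327.
# Real accession numbers have variations that this pattern does not cover.
ACCESSION_PATTERN = re.compile(r"\b\d{4}-\d{1,4}-\d{1,4}\b")


def find_accession_numbers(text):
    """Return candidate accession numbers in order of appearance, without duplicates."""
    seen = []
    for match in ACCESSION_PATTERN.findall(text):
        if match not in seen:
            seen.append(match)
    return seen
```

Fed the last line of the Tesseract output above, this picks out 1969-165-327 and ignores the "ca. 1959" date, because a bare year has no hyphenated groups after it.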

To test all of these ideas we built the simplest, dumbest HTTP pony server to receive photo uploads and return any text that Tesseract can extract. We’ll talk a little more about the server below but basically it has two endpoints: One for receiving photo uploads and another with a simple form that takes advantage of the fact that on lots of new phones the file upload form element on a website will trigger the phone’s camera.

This functionality is still early days but is also a pretty big deal. It means that the barrier to developing an idea or testing a theory and the barrier to participation is nothing more than the web browser on a phone. There are lots of reasons why a native application might be better suited or more interesting to a task but the time and effort required to write bespoke applications introduces so much hoop-jumping as to effectively make simple things impossible.

Given a simple upload form which triggers the camera and a submit button which sends the photo to a server we get back pretty much the same thing we saw when we ran Tesseract from the command line:

We upload a photo and the server returns the raw text that Tesseract extracts. In addition we do a little bit of work to examine the text for things that look like accession numbers. Everything is returned as a blob of data (JSON) which is left up to the webpage itself to display. When you get down to brass tacks this is really all that’s happening:

$> curl -X POST -F "file=@label-napkin.jpg" https://localhost | python -mjson.tool
{
    "possible": [
        "1969-165-327"
    ],
    "raw": "______________j________\nDesign for Textile: Napkins for La Fonda del\nSol Restaurant\n\nDrawing, United States ca. 1959\n\n________________________________________\nOffice of Herman Miller Furniture Company\n\nDesigned by Alexander Hayden Girard\n\nBrush and watercolor on blueprint grid on white wove paper\n\n______________._.._...___.___._______________________\nchocolate, chocolate, sandy brown, tan\n\n____________________..___.___________________________\nGift of Alexander H. Girard, 1969-165-327"
}

Do you notice the way, in the screenshot above, that in addition to displaying the accession number we are also showing the object’s title? That information is not being extracted by the “label-whisperer” service. Given the amount of noise produced by Tesseract it doesn’t seem worth the effort. Instead we are passing each accession number to the collections website’s OEmbed endpoint and using the response to display the object title.

Here’s a screenshot of the process in a plain old browser window with all the relevant bits, including the background calls across the network where the robots are talking to one another, highlighted.

Upload a photo
Extract the text in the photo and look for accession numbers
Display the accession number with a link to the object on the CH collection website
Use the extracted accession number to call the CH OEmbed endpoint for additional information about the object
Grab the object title from the (OEmbed) response and update the page
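The last two steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the page's actual JavaScript. The endpoint URL below is a placeholder (check the collection website for the real one), the function names are hypothetical, and the lookup from an accession number to an object page URL is left out. The `url` and `format` request parameters and the optional `title` response field come from the oEmbed specification.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; the real one lives on the collection website.
OEMBED_ENDPOINT = "https://collection.example.org/oembed/"


def oembed_request_url(object_url, endpoint=OEMBED_ENDPOINT):
    """Build an oEmbed request URL for a given object page."""
    return endpoint + "?" + urlencode({"url": object_url, "format": "json"})


def title_from_oembed(response_text):
    """Pull the optional 'title' field out of an oEmbed JSON response."""
    return json.loads(response_text).get("title")
```

Since `title` is optional in oEmbed, `title_from_oembed` returns `None` when it is missing and the page can fall back to showing just the accession number.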

See the way the OEmbed response contains a link to an image for the object? See the way we’re not doing anything with that information? Yeah, that…

But we proved that it can be done and, start to finish, we proved it inside of a day.

It is brutally ugly and there are still many failure states but we can demonstrate that it’s possible to transit from an analog wall label to its digital representation on a person’s phone. Whether they simply bookmark that object or email it to a friend or fall in to the rabbit hole of life-long scholarly learning is left as an exercise to the reader. That is not for us to decide. Rather we have tangible evidence that there are ways for a museum to adapt to a world in which all of our visitors have super-powers (aka their “phones”) and to apply those lessons to the way we design the museum itself.

We have released all the code and documentation required to build your own “label whisperer” under a BSD license but please understand that it is only a reference implementation, at best. A variation of the little Flask server we built might eventually be deployed to production but it is unlikely to ever be a public-facing thing as it is currently written.

https://github.com/cooperhewitt/label-whisperer/

We welcome any suggestions for improvements or fixes that you might have. One important thing to note is that while accession numbers are pretty straightforward there are variations and the code as it is written today does not account for them. If nothing else we hope that by releasing the source code we can use it as a place to capture and preserve a catalog of patterns because life is too short to spend very much of it training robot eyes to recognize accession numbers.

The whole thing can be built without any external dependencies if you’re using Ubuntu 13.10 and, if you’re not concerned with performance, can be run off a single “micro” Amazon EC2 instance. The source code contains a handy setup script for installing all the required packages.

Immediate next steps for the project are to make the label-whisperer server hold hands with Micah’s Object Phone since being able to upload a photo as a text message would make all of this accessible to people with older phones and, old phone or new, requires users to press fewer buttons. Ongoing next steps are best described as “learning from and doing everything” talked about in the links below:

Michal Migurski’s Walking Papers and Walking Papers Cheaply
Astrometry.net’s Making the Sky Searchable
The Royal Observatory’s Introducing Astrotags — if you don’t bother following any of the other links at least watch this because it’s basically the best thing ever
Matt Jones’ Product Sketch: Clocks for Robots

Discuss!

"C" is for Chromecast: hacking digital signage

Since the late 1990s museums have been fighting a pointless war against the consumerization of technology. By the time the PlayStation 2 was released in 2000, every science museum’s exhibition kiosk game looked, felt, and was terribly outdated. The visitors had better hardware in their lounge rooms than museums could ever hope to have. And ever since the first iPhone hit the shelves in 2007, visitors to museums have also carried far better computing hardware in their pockets.

But what if that consumer hardware, ever dropping in price, could be adapted and quickly integrated into the museum itself?

With this in mind the Labs team took a look at the $35 Google Chromecast – a wifi-enabled, HDMI-connected networked media streaming playback system about the size of a USB key.

With new media-rich galleries being built at the museum, and power and network ports in a historic building at a premium, we asked ourselves: could a Chromecast be used to deliver the functionality of a digital signage system, but at a fraction of the cost? Could some code be written to serve our needs and possibly those of thousands of small museums around the world as well?

Before we begin, let’s get some terms of reference and vocabulary out of the way. The first four are pretty straightforward:

Display – A TV or a monitor with an HDMI port.

Chromecast device – Sometimes called the “dongle”. The plastic thing that comes in a box and which you plug in to your monitor or display.

Chromecast application – This is a native application that you download from Google and which is used to pair the Chromecast device with your Wifi network.

Chrome and Chromecast extension – The Chrome web browser with the Chromecast extension installed.

That’s the most basic setup. Once all of those pieces are configured you can “throw” any webpage running in Chrome with the Chromecast extension on to the display with the Chromecast device. Here’s a picture of Dan Catt’s Flambientcam being thrown on to a small 7-inch display on my desk:

Okay! The next two terms of reference aren’t really that complicated, but their names are more conceptual than specific identifiers:

The “Sender” – This is a webpage that you load in Chrome and which can cause a custom web page/application (often called the “receiver”, but more on that below) to be loaded on to one or more Chromecast devices via a shared API.

The “Receiver” – This is also a webpage but more specifically it needs to be a living breathing URL somewhere on the same Internet that is shared by and can be loaded by a Chromecast device. And not just any URL can be loaded either. You need to have the URL in question whitelisted by Google. Once the URL has been approved you will be issued an application ID. That ID needs to be included in a little bit of Javascript in both the “sender” and the “receiver”.

There are a couple important things to keep in mind:

First, the “sender” application has super powers. It also needs to run on a machine with a running web browser and, more specifically, that web browser is the one with the super powers since it can send anything to any of the “displays”. So that pretty much means a dedicated machine that sits quietly in a locked room. The “sender” is just a plain vanilla webpage with some magic Google Javascript but that’s it.
Second, the “receiver” is a webpage that is being rendered on/by the Chromecast device. When you “throw” a webpage to a Chromecast device (like the picture of Dan’s Flambientcam above) the Chromecast extension is simply beaming the contents of the browser window to the display, by way of the Chromecast device, rather than causing the device to fetch and process data locally.

Since there’s no way to talk at this webpage (the “sender”) directly, because it’s running in a browser window, we need a bridging server or a… “broker” which will relay communications between the webpage and other applications. You may be wondering “Wait… talk at the sender” or “Wait… other applications?” or just plain “…What?”

Don’t worry about that. It may seem strange and confusing but that’s because we haven’t told you exactly what we’re trying to do yet!

We’re trying to do something like this:

We’re trying to imagine a system where one dedicated machine running Chrome and the Chromecast extension is configured to send messages and custom URLs for a variety of museum signage purposes to any number of displays throughout the museum. Additionally we want to support a variety of standalone “clients” that can receive information about what is being shown on a given display and send updates to it.

We want the front-of-house staff to be able to update the signage from anywhere in the museum using nothing more complicated than the web browser on their phone and we want the back-of-house staff to be able to create new content (sic) for those displays with nothing more complicated than a webpage.

That means we have a couple more names of things to keep track of:

The Broker – This is a simple socket.io server – a simple to use and elegant server that allows you to do real-time communications between two or more parties – that both the “sender” and all the “clients” connect to. It is what allows the two to communicate with each other. It might be running on the same machine as the Chrome browser or not. The socket.io server needn’t even be in the museum itself. Depending on how your network and your network security are configured you could even run this server offsite.

The Client – This is a super simple webpage that contains not much more than some Javascript code to connect to a “broker” and ask it for the list of available displays and available “screens” (things which can be shown on a display) and controls for setting or updating a given display.

In the end you have a model where:

Some things are definitely in the museum (displays, Chromecast devices, the browser that loads the sender)
Some things are probably in the museum (the client applications used to update the displays (via the broker and the sender))
Some things might be in the museum (the sender and receiver webpages themselves, the broker)

At least that’s the idea. We have a working prototype and are still trying to understand where the stress points are in the relationship between all the pieces. It’s true that we could just configure the “receiver” to connect to the “broker” and relay messages and screen content that way but then we would need to build all the logic behind what can and can’t be shown, and by whom, in to the receiver itself. That introduces extra complexity that becomes problematic to update easily across multiple displays and harder still to debug.

We prefer to keep the “sender” and “receiver” as simple as possible. The receiver is little more than an iframe which can load a URL and a footer which can display status messages and other updates. The sender itself is little more than a relay mechanism between the broker and the receiver.

All of the application logic to control the screens lives in the “broker” which is itself a node.js server. Right now the list of stuff (URLs) that can be sent to a display is hard-coded in the server code itself but eventually we will teach it to talk to the API exposed by the content management system that we’ll use to generate museum signage. Hopefully this enforces a nice clean separation of concerns and will make both development and maintenance easier over time.
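To show why it helps to keep that logic in one place, here is a minimal Python sketch of the bookkeeping a broker does. The real broker is a node.js socket.io server. This sketch leaves out all networking, and its class, method names, message shape and screen URLs are hypothetical placeholders rather than anything from the released code.

```python
# Hard-coded list of screens, mirroring the prototype's approach.
# The names and URLs here are placeholders.
SCREENS = {
    "subway": "https://example.org/subway-schedule",
    "hours": "https://example.org/opening-hours",
}


class Broker:
    """Tracks what each display shows and validates client requests."""

    def __init__(self, screens):
        self.screens = dict(screens)
        self.displays = {}  # display name -> screen name (or None)

    def register_display(self, display):
        """Record a display; a new display starts with no screen assigned."""
        self.displays.setdefault(display, None)

    def status(self):
        """What a client sees when it asks for displays and screens."""
        return {"displays": dict(self.displays), "screens": sorted(self.screens)}

    def set_screen(self, display, screen):
        """Return the message to relay to the sender, or raise on bad input."""
        if display not in self.displays:
            raise KeyError("unknown display: " + display)
        if screen not in self.screens:
            raise KeyError("unknown screen: " + screen)
        self.displays[display] = screen
        return {"display": display, "url": self.screens[screen]}
```

Because every request passes through `set_screen`, the rules about what can be shown live in one place, and the sender and receiver only ever see a display name and a URL.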

We’ve put all of this code up on our GitHub account and we encourage you to try it out, let us know where and when it doesn’t work, and contribute your fixes. (For example, careful readers will note the poor formatting of timestamps in some of the screenshots above… Thanks to hugovk, this particular bug has already been fixed!) The code is available at:

https://github.com/cooperhewitt/chromecast-signage

This is a problem that all museums share and so we are hopeful that this can be the first step in developing a lightweight and cost-effective infrastructure to deploy dynamic museum signage.

This is what a simple “client” application running on a phone might look like. In this example we’ve just sent a webpage containing the schedule for nearby subway stations to a “device” named Maui Pinwale.

We haven’t built a tool that is ready to use “out of the box” yet. It probably still has some bugs and possibly even some faulty assumptions (in its architecture) but we think it’s an approach that is worth pursuing and so, in closing, it bears repeating that:

Cooper Hewitt Labs

Technology + Media + Experience