Author Archives: Micah Walter

About Micah Walter

Micah is the Director of Digital & Emerging Media.

Our new ticketing website

Screenshot 2014-12-16 16.25.03

This past week we launched — a new online ticketing system which leverages our Consitutent Relationship Management application, Tessitura, as its “source of truth.”

It’s a simple application, really. It lets you pick the day you want to come visit us, select the kind of tickets you want to buy, and then you fill out your basic info, plug in your credit card digits and off you go. Moments later you receive an email with PDF versions of your tickets attached.

On the user-facing side of things, it is designed to be as simple as possible. You don’t need to log in, there is no “shopping cart”, and above all, you can do all of this from your phone if you want to skip ahead of the lines this Winter on your first visit back since we closed nearly three years ago.

A little background

The idea to pre-book tickets online came at us from a number of directions. Some time last year we decided to invest in Tessitura to handle all of our CRM needs. Tessitura, if you have never heard of it before, is an enterprise class, battleship that grew out of the Met Opera House and has made its way around the performing arts sector. It’s a great tool if you are looking to centralize everything there is to know about a Constituent. As a museum, it is also appealing in that it does many of the things that non-profit type cultural institutions need to do out of the box.

So, Tessitura. It is now a thing at our museum. Everyone on our staff started ramping up on the software and getting settled into the idea of using Tessitura for one thing or another. Our department began to get requests.

Obviously our membership department would like to use Tessitura to sell and manage memberships. Development would like to use it to manage and collect donations and gifts. Education would like to centralize all public programs, book tours, manage special events, and all of the other crazy things they do. And did I mention we have a museum that sells tickets?

This is how it always starts. The avalanche of ideas, whiteboard sessions, product demos and gentle emails that say things like “When will Tessitura be ready?” begin to pour in. You have to soak it all in and then wring it all out.

The Simplest, Dumbest Thing

Aaron says that quite often. “Just do the simplest, dumbest thing…” and he’s right. Often times you have to boil things down a bit to get to the real core issues at hand. It was clear from the start that this would be an essential part of the “design process” on this project.

So, I started out by asking myself this question:  “What is the most basic thing we want to do with Tessitura?”

I wound up with two clear answers.

  1. We wanted to be able to sell tickets online. Just basic, general admission tickets. Nothing fancy yet.
  2. We eventually want to use Tessitura as our identity provider, and as a way to pair your ticket with the Pen. More on that towards the end…

Tackling Tickets

So to get started, I thought about the challenge of selling a ticket online. I looked at other sites I liked such as StubHub, EventBrite, and other venue websites that I knew used Tessitura like BAM, Jazz at Lincoln Center, and the 92Y. I did some research, I bought some tickets, and I asked all my friends who used these sites what they liked and disliked. Eventually I started to find my way gravitating towards the Eventbrite way of doing things. We have been using Eventbrite for a couple of years here at Cooper Hewitt, for the most part as a way to sell tickets to education programs and events.

To tell you the truth, Eventbrite has been a dream come true for us and our sales for these events, both paid and free, have been very good. So, what is Eventbrite doing right? Simply put they’ve made the process of purchasing a ticket to an event online stupidly simple.

I wanted to know more. So, I spent some time and slowly walked myself through the process of booking all kinds of things on Eventbrite. I tried to step through each page in the process, I tried to notice what kind of user feedback I got, and what sort of emails and notifications I got. I tried the same on mobile devices and through their iPhone app. Here are a few takeaways.

  1. You don’t need to register in order to purchase a ticket on Eventbrite.
  2. If you don’t register when making an initial purchase, you can register later and see your purchase history.
  3. As soon as you book your tickets, you get them in an email.
  4. The Thank You page is just as useful when you are logged out as when you are logged in.
  5. Most importantly, you can only buy one thing at a time. In other words, there is no idea of a Shopping Cart.

That last one was pretty huge. Most eCommerce sites are built around the idea that users put items in a cart and then “Checkout.” Eventbrite doesn’t do it this way. Instead, you simply pick the thing you are wanting to attend, select the kinds of tickets you want ( student, senior, etc ) and then put in your credit card info. Once you hit submit, you’ve paid for your tickets and your transaction is complete.

I felt this flow was incredibly powerful and probably one of the reasons Eventbrite was working so well for our education programs. There are simply less chances to change your mind, less confusion over what you are buying, and the end-to-end process of picking something out and paying for it is just so much smaller than the more traditional shopping cart experience.

I began to think of it kind of like the difference between getting your weekly groceries and just picking up a six pack. The behaviors are totally different because you are trying to accomplish two totally different tasks. One is very routine, requires a little creativity and some patience, and a willingness to wander around and “pick.” The other is a strategic strike, designed to get in and get out so you can get home and relax with a nice cold one.

The Eventbrite concept seemed like what we wanted. I had my simplest dumbest thing, and something to model it on.

Technical Challenges

With every new project comes some kind of technical challenge(s). Tessitura is a “new to us” application and our staff at Cooper Hewitt were clearly at the bottom left of a steep learning curve when we started the project. We also had many challenges we knew we were going to have to face because we are a “Governmental Institution.” So things like PCI compliance, complex network configurations, and security scans were all things I was going to need to learn about.

Tessitura comes with two APIs. One is a somewhat older ( as in the first thing they built ) SOAP API, and the other is a newer ( as in still under development ) REST API. Both allow data to get in and out of Tessitura in a variety of forms.

In addition to the standard SOAP and REST APIs, Tessitura has the facility to expose just about anything you can build into an MS SQL Stored Procedure through its API. This is an incredibly powerful feature, which can also be quite dangerous if you think about it.

When I attended the Tessitura Learning Conference & Convention this past summer, it became clear to me that many institutions that use Tessitura are building some kind of API wrapper, or some type of middleware that helps them make sense of it all. We chose to do the same. To accomplish this, I chose to model the API wrapper off our own Collections API, which is a REST-ish API based on Flamework, and uses oAuth2 for authentication. Having this API wrapper allows us to all speak the same language and use the same interface. It is also very, very similar to the Collections API, so among our own staff, it is pretty easy to navigate. The API wrapper, wraps methods from both the Tessitura SOAP and the REST APIs and presents a unified interface to both of them. It doesn’t implement every single API method, and it exposes “new” methods that we have custom built via those Stored Procedures I mentioned.

The Tickets website is a separate project that talks directly to the API. It is also a Flamework project, written primarily in PHP. It uses a MySQL database to store a small amount of local data, but for the most part it is making calls to the API wrapper, which in turn is making calls to either the SOAP or REST Tessitura APIs. Tessitura is the source of truth for most of the things the Ticketing website does.

The Front End

Screenshot 2014-12-16 16.25.44

The Tickets site from the user’s perspective is designed to be extremely simple. I worked with Sam ( our in house front end guru ) to build a responsive, and simple web application that does basically one thing, but the devil is always in the details.

At first glance, all the site does is allow you to select the day you want to come visit us, pick out what kinds of tickets you want, and then fill out your billing info and receive your tickets. It’s basically a calendar and a form and not much more. But like I said, the devil is in the details.

First, Sam built a beautiful calendar like one I’d never seen before. We talked at length about how dumb most website calendars were, and we tried to push things in a new direction. Our calendar starts out by showing you what you most likely are looking for–Today. It displays the next bunch of days up to two weeks worth by default, and if you are looking for a special date, it lets you drop down and navigate around until you find it. On mobile its slightly different in that it doesn’t show you any past dates ( why would you want to book the past? ) and it limits things a little so you’re not as overwhelmed by the interface. We call this “designing for context” and we thought that users might be using their mobile phones to buy a ticket online and jump up to the front of the queue.

Once you’ve selected the date you want, the app loads up the available tickets right below the calendar. You can easily change your mind and pick a different date. From here you just select the type and quantity of each ticket you want. Sam’s code does a bunch of front-end validations to make sure everything you are trying to do makes sense ( you can only purchase a youth ticket with another paid ticket for example ). Between the two of us, we try to do as much validation to what you are selecting as possible, both in Javascript on the front-end and in PHP on the back.

Once you hit Order Now, an order form is generated and displayed. I think its important to note here that nothing has really happened yet in Tessitura-land. We asked Tessitura for some details about the tickets you are interested in, but we haven’t “added them to your cart” or anything like that as of yet.

Screenshot 2014-12-16 16.28.34

You can then fill out your vitals. We ask you to give us your name, email, credit card details and billing address. We store all of this, with the exception of your credit card, in Tessitura. We make you an account, and at this point we send you an activation email which allows you to set up your password at your leisure. If all goes well with your credit card, we build your tickets ( I chose to do all this with FPDF rather than try and use Tessitura’s built in Print at Home server thing ) send them in an email, and then take you to a Thank You Page. You never had to register, or log in, and you technically never do. Your PDF tickets arrive in your inbox and that is technically all you need.

Screenshot 2014-12-16 16.27.35

As a little bonus, we just stick the barcode and some basic metadata about each ticket you bought on the Thank You Page so you can just present your phone at the door. This part is still a little rough and I chose to leave it that way for the time being so we can do some user research in the galleries. It’s a nice feature, but only time will tell if people actually use it or if it needs some finesse.

Screenshot 2014-12-16 16.26.50

Now that you are in the system, you can buy more tickets using the same email address and they will be connected to your same account, even if you still have never logged in. If you do choose to activate your account and login, you can look at your order history, and reprint your tickets if you’ve lost your email copy.

Screenshot 2014-12-16 16.27.10

Tessitura & The Pen

A while back in this blog post, I mentioned that we also wanted to use Tessitura as our identity provider and as a way to pair the ticket you’ve purchased with the pen we’ve handed you. This work is nearly done, but not yet in production. It will go live when our pen is available sometime in early 2015. But, the short story is, when you buy a ticket, either online or in person, we generate a special coded version of your ticket. This code gets paired up with the internal ID of the pen we gave you and that pairing gets stored in a database. What this all amounts to is that when you get home, and you want to see all the cool things you’ve done with your pen, you simply enter the code ( or go to a custom short URL ) on our website. We look up your pairing and are able to connect your Identity ( Tessitura ) with your Visit ( on the Collections site ). But that is all the topic of a future series of blog posts.


Now that the Tickets site is up and running, and we are watching the sales roll in, it’s easy to start thinking of more features and new ways to expand what the site can do. I’ve already started building simple admin tools and have been thinking about building a basic check-in app for off-site events. It’s too early to talk about all of the things we aim to do and how we plan to expand our online sales, but I’m hopeful that we will stay focused and narrow in our approach, offering our users the most elegant visitor experience possible. Or at the very least, the simplest, dumbest thing.

Downgrading your website (or why we are moving to WordPress)

Below are the slides and most of what I said at the 2014 Museums & The Web conference in Baltimore, Maryland.

“I believe that if we think first about people and then try, try, and try again to prototype our designs, we stand a good chance of creating innovative solutions that people will value and enjoy.” — Bill Moggridge


Let me begin by telling you a little story about a small museum that sat along 5th Ave. on New York’s Upper East Side. This is of course a largely fictional story. Names, and actual events have been changed.

This is the story of a little museum with big aspirations. Long ago this little museum had a website. It had a webmaster, and it published a blog. It even had a whole bunch of microsites, flash driven exhibition sites, event calendars and archives. In fact, it won a few Webby’s.

The website was very much the product of an organization trying to get the job done. And, it succeeded in this effort. Staff members would produce content on their company issued PCs and would then hand these documents off to the museum’s webmaster who would convert them into HTML and Javascript. The webmaster would press a specially designed “button” which would upload the new content to the little museum’s web servers where the pages would be served and maintained by a giant umbrella organization that had close ties to the government.
With a single webmaster managing the entirety of the museum’s web properties, the little staff of this museum faced an inevitability. It was just too much work for the webmaster to do alone. Even if they allowed the webmaster an apprentice, the workload would continue to grow, and the little museum’s website would suffer. Eventually, they all realized they would have to move towards a system that would allow the entire staff to collaborate more efficiently.

Eventually, they realized they would need a content management system.

There were many options out there already, and the little museum’s webmaster took stock in as many of them as he could. Meetings were had, and budgets were considered. The “committee to select a content management system” was formed, and consultants were brought in.

Wire frames were presented, and scopes of work were proposed, but the committee remained vigilant and put off making a decision as long as it could. They simply never felt like they had the right solution placed in front of them.

There was a lot at stake and many facets and bullet points drove them to a moment of indecision. There was due-dilligence due to their “mothership” in Washington, and there were “rights in data” clauses to be haggled over, with threats of time in a Federal prison always on everyone’s minds. Eventually the committee was disbanded and the project was put on hold.
Time went on and the little museum’s website continued to shine as the public face of the institution. It continued to be updated with more and more content, and eventually the little museum even invested a fair amount of money in putting their collections online for all to see.

The word on the street was that this little museum’s website was starting to blow up, more and more people were beginning to rely on it as source of good information, and the time had come to re-think the idea of re-building.
The webmaster at the little museum was doing his best, running around from staff member to staff member, trying to understand what had been going on all this time. One day he had the fortune to sit in on a meeting with a prominent weblogger and asked him a very important question.

“What CMS do you think we should chose” the webmaster said.

“CMS’s are all basically the same”, said the blogger, “just chose one you like and don’t look back.”

The webmaster took this to heart and selected three CMS systems that were free and easy to set up. He presented these to the higher ups and after a couple of hours of debate and one technical review board meeting, the webmaster had his answer.

Drupal would be the content management system for the little museum. Drupal.

The end, well sorta.

Most of that actually happened at the Cooper-Hewitt. The team eventually just had to pick a system, (without a whole lot of experience with the product itself) and kind of just “go for it.” From that point on, the staff at Cooper-Hewitt were living with Drupal. Drupal, a word almost none of the staff had ever heard before became, in less than a few months, a dirty word, spoken in fits of anger and dismay.

Now, before we go any further, it really needs to be said out loud that Drupal is really fine piece of software that has grown and evolved into a very sophisticated and well thought out framework for building websites. It has a rich community of developers and enthusiasts behind it and it powers some of the most popular websites on the planet. It’s used by giant companies far and wide, governments, and educational institutions all over. As well, our team in Washington has come a really long way in learning how to host and maintain Drupal based websites and presently, many of the latest Smithsonian websites are being built on Drupal. There is nothing intrinsically wrong with Drupal, we just realized, after a long time, it wasn’t for us.
I’m Micah Walter. I’m part of the nerd crew at Cooper-Hewitt. We are part of the Smithsonian ( that umbrella corporation in Washington )… and we are in the middle of a re-launch of our physical museum, as well as our digital presence.
Cooper-Hewitt started it’s life with a CMS by installing a copy of Drupal 6. Shortly thereafter, we installed some modules, and more modules, and more…modules. Eventually we had a pretty awesome website. We hired an engineering team to convert the look and feel of the old website into a Drupal theme, and we “went live.” Cooper-Hewitt was on a CMS and it felt good.


random extra slide

Some time during this process we sat down with all the staff members to show off our new CMS. We took them on a tour of the system and poked around with a few of the CMSs features, with the hopes of getting staffers excited about the whole thing. The staff seemed to respond positively, and after a couple of months of configuring Drupal’s permissions matrix, we gave out login details to a select number of “power users” around the museum. A few of these power users got it right away and were off and running, updating their existing webpages when they needed to. It wasn’t too bad actually. Staff could easily log in, search around for the relevant content and make minor changes to their pages. The problems started to appear when they wanted to do just slightly more. A staffer wasn’t able to easily upload an image to Drupal. The image had to first be sent to our graphics person who would convert it to a jpeg, resize it for the web and then it would be sent to the webmaster who would upload it to an Amazon S3 server. Once this was done the webmaster would email the URL to the image back to the staffer who would then try and figure out how to insert it into their page.

Another issue arose when staffers tried to author new pages. It was simply difficult for them to understand how the new page would find its way within the information architecture that was already in place. How were they to set the new page’s URL and menu items. Those kinds of tasks inevitably wound up back on the webmaster’s desk.
For the most part, notwithstanding a few hiccups here and there, Drupal 6 ran pretty smoothly. Staffers were able to distribute the workload a little more than they used to, and that was considered a good thing. But, about a year into it, a grant became available and the notion of running a daily blog about our objects turned into a reality. Object of the Day was born, and we had our work cut out for us.
Object of the Day went through many stages of evolution, eventually winding up as an institutional blog authored by staffers, students in our Masters program, docents, and even teens and high school kids interested in design. Every day another object from the collection was chosen and a post was written about it and published to our blog. Great pains were taken to ensure we considered the collection record, tags, the authors vitals and more. We met in committee meetings over and over and eventually worked out a plan to allow us to manage project. The end result would be a new post about a different object, every day.

In the beginning we toyed around with the idea of Object of the Day being run on a separate platform. We considered Tumblr, and even Blogger. But in the end, we decided we would put our new CMS to the test and put ourselves through the process of managing a daily blog with Drupal.
To accomplish this, the digital team realized we’d probably be wise to migrate to Drupal 7 in order to take advantage of its much improved back end user interface. So, with Object of the Day as catalyst, we moved ahead with plans to migrate our Drupal installation to D7. Consultants were hired, interns were enslaved and the whole process took just a few months. In the end we wound up with a fresh installation of Drupal 7, and about 20 or so contributed modules.
In parallel to this migration project we began to meet with staff members and work out the details of how this Object of the Day project would go down. We discussed a variety of organizational schemes, we talked about available resources, and how far the grant money might take us. In the end we came up with a pretty simple plan. Each month, one staff member would be the “editor” for Object of the Day. He or She would be responsible for collecting all the entries for the month, making sure they were entered into Drupal, edited and fact checked. They would then get scheduled to be published automatically on their specific day. This included many spreadsheets, checklists and meetings. It was of course, great user research for me and my team.

Once we had D7 up and running staff members started to get the hang of it. They started logging in and authoring content. And then the problems started to happen.
We already had about 1500 pages ( Drupal calls these nodes ) in the CMS. They were mostly static web pages about one program or another, or blog posts from the old days, or exhibition archives and other kinds of historic content. This was just fine as that content rarely got touched or updated. It was also fine when we wanted to add a fresh blog post or a new static page every once in a while…

The problem though was what happened when the monthly Object of the Day editor had to log in to start work on their thirty some posts for the upcoming month. It was nearly impossible for them to collect all the posts in one place within the CMS so that they could see what had been entered, what was finalized and what was ultimately scheduled. This was a major first hiccup and the digital team worked out a solution involving a number of custom Drupal views that would allow the editors to more easily see what they were working on. It kind of worked, but we could tell that it was a hack solution to a real problem.

The end result was, they lived with it. They lived with the system, learned to hate it, and just didn’t talk about it much. Drupal became this beast that they just came to terms with.
Time went one, and we all learned to work with Drupal. Many of the staff members became proficient enough to get by, and the calls to the webmaster desk lessened. But, the problems hadn’t gone away. In fact our little experiment to try and get staff members excited about authoring content on the web had actually backfired. Now, staff members authored content for Object of the Day because it was part of their job, listed in their work plans and reviewed during their performance evaluations at the end of each year. They hated it.

Meanwhile, Object of the Day took off. The public facing version of the blog became a big success. It received additional funding for a second year with the idea around the Sr. Management table being that it would go on forever. It was for a time our most popular page on the site.
If there is one truth we have learned about maintaining a website using a CMS its that you’ll eventually jump ship and switch to something else. In fact you may do this operation again and again. Its just the nature of the beast—the grass is always greener.

When we realized we needed to jump ship, we took to heart all the feedback we got from our content creators. We realized that what they really wanted were pleasant, easy to work with tools that allowed them to feel empowered. Tools that gave them a sense of authority, and made them feel good about the work they were doing. Like it was a way for them to communicate with the world all the important things they had going on.

In the end we chose WordPress. We looked at lots of options. We thought about even simpler options like a static site generator, or hmm, Squarespace? Could a museum run their entire website on Tumblr? All of these options afforded us with a great user experience, but seemed to trade of the ability to be flexible enough for our institutions needs. It really depends on the needs of each institution.

We searched far and wide. But we kept coming back to WordPress. It was familiar to everyone. Many of the staff already had their own WordPress blogs. WordPress gave us a nice balance between having the ability to create a sophisticated website and also being simple enough to use. In fact, while I was writing tools to migrate our content to WordPress, we realized that its more simplified system allowed us to re-organize our content, making the site easier to navigate. It’s not that we couldn’t do this in Drupal, but over time, Drupal just got out of control, because it let us.
We realized through the Object of the Day project that it was our CMS that stood in the way of success. The content was already good, the audience was already there. We just needed a way to get our own staff excited about doing it. It shouldn’t be hard. It should be really easy and really fun to do. WordPress lets our staff get excited about the work they are doing. It gives them a simple to use, enjoyable writing experience, and for the editors, we found some really great plugins that let them manage all the content without feeling overwhelmed. Thats really why we chose WordPress.

We kind of think of it as a downgrade on the technical side of things, but its definitely an upgrade when it comes to usability.

The end.


There was some good discussion following the talk. A few things of note that were brought up included how our staff already had some experience with WordPress via our website, our use of EditFlow for notifications and calendaring/scheduling of content and Pressbooks to aid with the production of our eBooks.

We also talked a little about hindsight…



Voices on our blog! A new Labs experiment


I have recently been experimenting with a new service on the web called SpokenLayer. SpokenLayer offers a network of on demand “human voices” who are ready to voice content on the web. SpokenLayer works completely behind the scenes and in an on-demand kind of way. As new page requests come in, SpokenLayer adds them to your queue. Then, the content is sent to SpokenLayer’s Mechanical-Turk-like network of recording artists who voice your content in small recording studios around the world. New sound recordings are then automatically added to your site via a simple Javascript snippet.

There are many possibilities for how Cooper-Hewitt might utilize this kind of a service. We see this as a useful way of experimenting with the podcast-ability of our content, bringing our content to a wider audience and allowing for better accessibility to those who need it. It also works nicely when you are on the go, and I am really eager to figure out how we might connect this up to our own collections API.

For a first pass we’ve decided to try it out on the Object of the Day blog. From now on, you will notice a small audio player at the top of each Object of the Day post. Click the play button on the left and you will be able to hear the “voiced” version of each day’s post ( be sure to turn on your computer’s speakers ). It usually takes anywhere from a half hour to a day for a new audio recording to appear.

I thought this one was particularly good as the recording artist was able to do a pretty good job with some of the Dutch accented words in the text.

It is an experiment and we’ll see how it goes. Here are a few examples you can listen to right now.

Default Sort, or what would Shannon do?

Claude Shannon

Up until recently our collections website displayed search results ordered by, well, nothing in particular. This wasn’t necessarily by design, we just didn’t have any idea of how we should sort the results. We tossed around the idea of sorting things by date or alphabet, but this seemed kind of arbitrary. And as search results get more complex, ‘keyword frequency’ isn’t necessarily equivalent to ‘relevance’.

We added the ability on our ‘fancy search‘ to sort by all of these things, but we still needed a way to present search results by default.

Enter Claude Shannon and a bit of high level math (or ‘maths’ if, like some of the team, are from this thing called ‘the Commonwealth’).

Claude Shannon was a pretty smart guy and back in 1948 in a paper titled “A Mathematical Theory of Communication” he presented the idea of Entropy, or information theory. The concept is actually rather simple, and relies on a quick analysis of a dataset to discover the probability of different parts of data within the set.

For images you can think about it by looking at a histogram and thinking of the height of each bar in the histogram as a representation of the probability that a particular pixel value will be present. With this in mind you can get a sense of how “complex” an image is. Images with really flat histograms ( lots of pixel values present lots of times ) will have a very high Entropy, where as images with severely spiked histograms ( all black or all white for example ) will have a very low Entropy.

In other words, images with more fine detail have a higher Entropy and are more complicated to express, and usually take up more room on disk when compressed.

Sidewall, a-w-793, 193040

Think of an image of a wallpaper pattern like this one. It has a really high Entropy value because within the image there is lots of fine detail and texture. If we look at the histogram for the image we can see that there are lots of pixel values represented pretty evenly across the graph with a few spikes in the middle most likely representing the overall palette of the image.

Screen Shot 2013-06-21 at 10.24.25 AM

On the other hand, check out this image of a pretty smooth vase on a white background. The histogram for this image is less evenly distributed, leaning towards the right of the graph and thus has a much lower Entropy value.

Vase, 2010-6-3, 188388

Screen Shot 2013-06-21 at 10.36.42 AMWe thought it might be interesting to sort all of the images in our collection by Entropy, displaying the more complex and finer detailed images first, so I built a simple python script that takes an image as input and returns its “Shannon Entropy” as a float.

To chew through the entire collection we built this into a simple “httpony” and built a background task to run through every image in our collection and add its Shannon Entropy as a value in the collection database. We then indexed these values in Solr and added the option to sort by “image complexity” in our Fancy Search page.

Screen Shot 2013-06-21 at 10.41.49 AM

Sorting by Shannon Entropy is kind of interesting, and we noticed right away that a small byproduct of this process is that objects that simply dont have an image wind up at the end of the sort. In the end we liked the search results so much that we made “image complexity” the default sort across the entire website. You can always go into Fancy Search and change the sort criteria to your liking, but we thought image complexity seemed to be a pretty good place to start.

But what is the relationship between Claude Shannon and Shannen Doherty? Well, it looks like Shannen, herself, has a very high Shannon Entropy…



Screen Shot 2013-06-21 at 10.58.46 AM

Welcome to object phone. Your call has been placed in a queue.

I made another small thing. Again, another way for me to experiment with the Collection API, and again, another way to experiment with new ways of accessing the collection. This time, there aren’t many screen shots to display–there is no website to look at. This time, it’s “Welcome to object phone!”

(718) 213-4915

Object Phone” is ( presently ) a very, very simple implementation of a way to explore our collection by dialing a telephone, or sending a text message. I had been thinking of a few of the more popular museum oriented audio tour products, and how they all seem to be very CMS style in their design, and wondering if we could just use our own API.

For example, TourML and TAP ( which offer the web programmer a very powerful framework for programming a mobile guide using the Drupal CMS ) are very nice, but they are still very dependent on content production. The developer or content manager has to build and curate all of the content for the “tour.” This might be a good way to go about things, especially if you are leaning on an existing Drupal installation for a good deal of your content, but I was looking for a way to access existing data, and specifically the data in our collection website.

In the beginning of developing our collection website, we went through the process of assigning EVERYTHING a unique “bigint” in the form of what we are referring to as an “artisinal integer.” This means that each object record, each person record and each, well, everything else has a unique integer which no other thing can have. This is not in place of accession numbers–we will probably always have accession numbers The nice thing about unique integers is that they’re really easy to deal with on a programmatic level.

For example, if you text 18704235 to 718-213-4915 you should get a response that looks like the screenshot below. In fact you can text any object id number from our collection and get a similar response.

2013-04-18 10.15.18

You can also dial that same number and use your keypad to either search the collection by object ID, or ask for a random object. The application will respond to you using a text to speech converter, which is usually pretty good.

Presently, the app is not replying with a whole lot of information. You essentially get the object’s title and medium field if it has one. In many cases, asking for a random object may just result in something like “Drawing.” Many of our object records don’t have much more useful information than this, and also, I am trying to wrangle with the idea of how much information is useful in a voice and text message ( with a 160 character limit per SMS).

The whole system is leveraging the Twilio service and API. Twilio offers quite a range of possibilities, and I am very excited to experiment with more. For example, instead of text to speech, Twilio can play back .wav files. Additionally, Twilio can do things like dial another phone number, forward calls and record the caller’s voice. There are so many possibilities here that I wont even begin to list them, but for example, I could easily see us using this to capture user feedback in our galleries by phone and text.

I’m very interested in figuring out a way to search by voice. I’m sort of dreaming of programming the thing to go “Why don’t you just tell me the object number!” as in this great episode of Seinfeld which you can watch by clicking the image below.

Screen Shot 2013-04-18 at 10.35.01 AMIf you are interested, I have also made the code public on this Gist. It’s pretty messy and redundant right now, but you’ll get the idea.

One of the more complicated aspects of this project will be designing the phone interface so it makes sense. Currently, once you hear an object play back, the system just hangs up on you. It would be nice to offer the user a better way to manipulate the system which is still pleasant and easy to understand. By that same token, there is a completely different approach that is needed for the SMS end of things as you don’t really have a menu tree, but instead of list of possible commands the user need to learn. Fortunately, there is a ton of great work that has already been accomplished in this arena, specifically by the Walker Art Center’s very long running and very yellow website Art on Call.

Source code at

Introducing the Albers API method

Screen Shot 2013-02-06 at 12.11.51 PM

We recently added a method to our Collection API which allows you to get any object’s “Albers” color codes. This is a pretty straightforward method where you pass the API an object ID, and it returns to you a triple of color values in hex format.

As an experiment, I thought it would be fun to write a short script which uses our API to grab a random object, grab its Albers colors, and then use that info to build an Albers inspired image. So here goes.

For this project I chose to work in Python as I already have some experience with it, and I know it has a decent imaging library. I started by using pycurl to authenticate with our API storing the result in a buffer and then using simplejson to parse the results. This first step grabs a random object using the getRandom API method.


buf = cStringIO.StringIO()

c = pycurl.Curl()
c.setopt(c.URL, '')
d = {'method':'cooperhewitt.objects.getRandom','access_token':api_token}

c.setopt(c.WRITEFUNCTION, buf.write)

c.setopt(c.POSTFIELDS, urllib.urlencode(d) )

random = json.loads(buf.getvalue())


object_id = random.get('object', [])
object_id = object_id.get('id', [])

print object_id

I then use the object ID I got back to ask for the Albers color codes. The getAlbers API method returns the hex color value and ID number for each “ring.” This is kind of interesting because not only do I know the color value, but I know what it refers to in our collection ( period_id, type_id, and department_id ).

d = {'method':'cooperhewitt.objects.getAlbers','id':object_id ,'access_token':api_token}

c.setopt(c.POSTFIELDS, urllib.urlencode(d) )

albers = json.loads(buf.getvalue())

rings = albers.get('rings',[])
ring1color = rings[0]['hex_color']
ring2color = rings[1]['hex_color']
ring3color = rings[2]['hex_color']

print ring1color, ring2color, ring3color


Now that I have the ring colors I can build my image. To do this, I chose to follow the same pattern of concentric rings that Aaron talks about in this post, introducing the Albers images as a visual language on our collections website. However, to make things a little interesting, I chose to add some randomness to the the size and position of each ring. Building the image in python was pretty easy using the ImageDraw module

size = (1000,1000)
im ='RGB', size, ring1color)
draw = ImageDraw.Draw(im)

ring2coordinates = ( randint(50,100), randint(50,100) , randint(900, 950), randint(900,950))

print ring2coordinates

ring3coordinates = ( randint(ring2coordinates[0]+50, ring2coordinates[0]+100) , randint(ring2coordinates[1]+50, ring2coordinates[1]+100) ,  randint(ring2coordinates[2]-200, ring2coordinates[2]-50) , randint(ring2coordinates[3]-200, ring2coordinates[3]-50) )

print ring3coordinates

draw.rectangle(ring2coordinates, fill=ring2color)
draw.rectangle(ring3coordinates, fill=ring3color)

del draw'file.png', 'PNG')

The result are images that look like the one below, saved to my local disk. If you’d like to grab a copy of the full working python script for this, please check out this Gist.


A bunch of Albers images

So, what can you humanities hackers do with it?

Curatorial Poetry

Curatorial Poetry

I made a fun thing…yesterday. In my 20% time, downtime, er.. 2am time, I decided to build a simple tumblr blog called Curatorial Poetry. I was inspired by Aaron’s take on our collection data and how he chose to present objects in our collection that have no image, but have a “description.” In the office we often have fun reading these aloud, or better, with Apple’s screen reader.

But, I thought it would be fun to “reblog” these in another form. So, I built a simple python script to do just that. To do this, I forked our collection data and wrote a short tool to convert our JSON objects into an sqlite3 database. I chose sqlite3 because, well, it’s light, and doesn’t require me to set up a DB server or anything.

Next, I spent most of my time trying to learn how oAuth2 works. It took me a good bit of time googling around before I realized that the python-oath2 library includes oAuth1 ( which tumblr uses ). All I really needed to do with the tumblr API was to create the post. Once I had my keys worked out and authenticating, it was just one line of code.

Once the post is published, the script updates my sqlite3 db so it makes sure to not post the same thing twice. Thats all!

I’d like to expand the code for this a bit to add some error checking, build in connections to our own API ( instead of using the data dump ) and connect with Twitter. I’m also interested in adding other museum’s data. We have the IMA data available on GitHub, but they don’t include “description” text, so well see… In the meantime, follow it and you’ll receive a new “poem” in your tumblr feed every two hours, for the next 8 years!

Getting lost in the collection (alpha)

Last week marked a pretty significant moment in my career here at Cooper-Hewitt.

As I’m sure most of you already know, we launched our Alpha collections website. The irony of this being an “alpha” of course is that it is by leaps and bounds better than our previous offering built on eMuseum.

If you notice in the screengrab below, eMuseum was pretty bland. The homepage, which is still available allowed you to engage by selecting from one of 4 museum oriented departments. You could also search. Right…

Upon entering the site, either via search or through browsing one of the four departments several things were in my mind huge problems.

Above is a search for “lennon” and you get the idea. Note the crazy long URLs with all kinds of session specific data. This was for years a huge issue as people would typically copy and paste that URL to use in blog posts and tweets. Trying to come back to the same URL twice never worked, so at some point we added a little “permlink” link at the bottom, but users rarely found it. You’ll also note the six search options under the menu item “Search.” OK, it’s just confusing.

Finally landing on an object page and you have the object data, but where does it all lead you to?

For me the key to a great, deep, user experience is to allow users to get lost within the site. It sounds odd at first. I mean if you are spending your precious time doing research on our site, you wouldn’t really want to “get lost” but in practice, it’s how we make connections, discover the oddities we never knew existed, and actually allow ourselves to uncover our own original thought. Trust me, getting lost is essential. (And it is exactly what we know visitors enjoy doing inside exhibitions.)

As you can probably tell by now, getting lost on our old eMuseum was pretty tough to do. It was designed to do just the opposite. It was designed to help you find “an object.”

Here’s what happens when you try to leave the object page in the old site.

So we ditched all that. Yes, I said that right, we ditched eMuseum! To tell you the truth, this has been something I have been waiting to do since I started here over two years ago.

When I started here, we had about 1000 objects in eMuseum. Later we upped that to 10,000, and when Seb began (at the end of 2011) we quickly upped that number to about 123,000.

I really love doing things in orders of magnitude.

I noticed that adding more and more objects to eMuseum didn’t really slow it down, it just made it really tough to browse and get lost. There was just too much stuff in one place. There were no other entry points aside from searching and clicking lists of the four departments.

Introducing our new collections website.

We decided to start from scratch, pulling data from our existing TMS database. This is the same data we were exporting to eMuseum, and the same data we released as CC0 on GitHub back in February. The difference would be, presenting the data in new ways.

Note the many menu options. These will change over time, but immediately you can browse the collection by some high level categories.

Note the random button–refreshing your browser displays three random objects form our collection. This is super-fun and to our surprise has been one of the most talked about features of the whole alpha release. We probably could have added a random button to eMuseum and called it a day, but we like to aim a little higher.

Search is still there, where it should be. So let’s do a similar search for “lennon” and see what happens.

Here, I’ve skipped the search results page, but I will mention, there were a few more results. Our new object page for this same poster by Richard Avedon is located at the nice, friendly and persistent URL It has its own unique ID (more on this in a post next week by Aaron) and I have to say, looks pretty simple. We have lots of other kinds of URLs with things in them like “people” , “places” and “periods.” URLs are important, and we intend to ensure that these live on forever. So, go ahead and link away.

The basic page layout is pretty similar to eMuseum, at first. You have essential viatal stats in the gray box on the right. Object ID, Accession Number, Tombstone data, etc, etc. You also have a map, and some helpful hints.

But then things get a little more exciting. We pull in loads of TMS data, crunch it and link it up to all sorts of things. Take a scroll down this page and you’ll see lots of text, with lots of links and a few fun bits at the end, like our “machine tag” field.

Each object has been assigned a machine tag based on its unique ID number. This is simple and straightforward future-proofing. If you’re on a site like Flickr and you come across (or have your own) photo of the same object we have in our collection, you can add the machine tag.

Some day in the near future we will write code to pull in content tagged in this way from a variety of sources. This is where the collection site will really begin to take shape as well not only be displaying the “thing we have” but its relevance in the world.

It puts a whole new spin on the concept of “collecting” and it’s something we are all very excited to see happen. So start tagging!

Moving on, I mentioned that you can get lost. This is nice.

From the Avedon poster page I clicked on the decade called 1960’s This brings me to a place where I can browse based on the decade. You can jump ahead in time and easily get lost. It’s so interesting to connect the object you are currently looking at to others from the same time period. You immediately get the sense that design happens all over the world in a wide variety of ways. It’s pretty addictive.

Navigating to the “person” page for Richard Avedon, we begin to see how these connections can extend beyond our own institutional research. We begin by pointing out what kinds of things we have by Avedon. This is pretty straight-forward, but in the gray box on the right you can see we have also linked up Avedon’s person record in our own TMS database with a wide variety of external datasets. For Avedon we have concordances with Freebase, MoMA, the V & A, and of course Wikipedia. In fact, we are pulling in Wikipedia text directly to the page.

In future releases we will connect with more and more on the web. I mean, thats the whole point of the web, right? If you want to help us make these connections ( we cant do everything in code ) feel free to fork our concordances repository on GitHub and submit a pull request.

We have many categorical methods of browsing our collection now. But I have to say, my favorite is still the Random images on the home page. In fact I started a Pinterest board called “Random Button” to capture a few of my favorites. There are fun things, famous things, odd things and downright ugly things, formerly hidden away from sight, but now easy to discover, serendipitously via the random button!

There is much more to talk about, but I’ll stop here for now. Aaron, the lead developer on the project is working on a much more in depth and technical post about how he engineered the site and was able to bring us to a public alpha in less than three months!

So stay tuned…. and get exploring.

People playing with collections #14: collection data on Many Eyes

Many Eyes Website

Many Eyes Website

I love seeing examples of uses of our collection metadata in the wild. bartdavis has uploaded our data to Many Eyes and created a few visualizations.

I found it interesting to see how many “matchsafes” we have in the collection, as you can easily see in the “color blindness test” inspired bubble chart! Here are a few screen grabs, but check them out for yourself at

Of interest to us, too, is that these visualisations are only possible because we released the collection data as a single dump. If we had, like many museums, only provided an API, this would not have been possible (or at least been much more difficult) to do.

Bubble chart of object types

Bubble chart of object types

Number of objects by century

Number of objects by century

Word cloud of object types

Word cloud of object types

Totally cached out

We do a good deal of cacheing on our web properties here at Cooper-Hewitt.

Our web host, PHPFog adds a layer of cacheing for free known as Varnish cache. Varnish Cache sits in front of our web servers and performs what is known as reverse proxy cacheing. This type of cacheing is incredibly important as it adds the ability to quickly serve cached files to users on the Internet vs. continually recreating dynamic web-pages by making calls into the database.

For static assets such as images, javascripts, and css files, we turn to Amazon’s CloudFront CDN. This type of technology ( which I’ve mentioned in a number of other posts here ) places these static assets on a distributed network of “edge” locations around the world, allowing quicker access to these assets geographically speaking, and as well, it removes a good deal of burden from our application servers.

However, to go a bit further, we thought of utilizing memcache. Memcache is an in-memory database key-value type cacheing application. It helps to speed up calls to the database by storing as much of that information in memory as possible. This has been proven to be extremely effective across many gigantic, database intensive websites like Facebook, Twitter, Tumblr, and Pinterest ( to name just a few ). Check this interesting post on scaling memcached at Facebook.

To get started with memcache I turned to Amazon’s Elasticache offering. Elasticache is essentially a managed memcache server. It allows you to spin up a memcache cluster in a minute or two, and is super easy to use. In fact, you could easily provision a terabyte of memcache in the same amount of time. There is no installation, configuration or maintenance to worry about. Once your memcache cluster is up and running you can easily add or remove nodes, scaling as your needs change on a nearly real-time basis.

Check this video for a more in-depth explanation.

Elasticache also works very nicely with our servers at PHPFog as they are all built on Amazon EC2, and are in fact in the same data center. To get the whole thing working with our blog, I had to do the following.

  1. Create a security group. In order for PHPFog to talk to your own Elasticache cluster, you have to create a security group that contains PHPFog’s AWS ID. There is documentation on the PHPFog website on how to do this for use with an Amazon RDS server, and the same steps apply for Elasticache.
  2. Provision an Elasticache cluster. I chose to start with a single node, m1.large instance which gives me about 7.5 Gig of RAM to work with at $0.36 an hour per node. I can always add more nodes in the future if I want, and I can even roll down to a smaller instance size by simply creating another cluster.
  3. Let things simmer for a minute. It takes a minute or two for your cluster to initialize.
  4. On WordPress install the W3TC plugin. This plugin allows you to connect up your Elasticache server, and as well offers tons of configurable options for use with things like a CloudFront CDN and more. Its a must have! If you are on Drupal or some other CMS. there are similar modules that achieve the same result.
  5. In W3TC enable whatever types of cacheing you wish to do and set the cache type to memcache. In my case, I chose page cache, minify cache, database cache, and object cache, all of which work with memcache. Additionally I set up our CloudFront CDN from within this same plugin.
  6. In each cache types config page, set your memcache endpoint to the one given by your AWS control panel. If you have multiple nodes, you will have to copy and paste them all into each of these spaces. There is a test button you can hit to make sure your installation is communicating with your memcache server.

That last bit is interesting. You can have multiple clusters with multiple nodes serving as cache servers for a number of different purposes. You can also use the same cache cluster for multiple sites, so long as they are all reachable via your security group settings.

Once everything is configured and working you can log out and let the cacheing being. It helps to click through the site to allow the cache to build up, but this will happen automatically if your site gets a decent amount of traffic. In the AWS control panel you can check on your cache cluster in the CloudWatch tab where you can keep track of how much memory and cpu is being utilized at any given time. You can also set up alerts so that if you run out of cache, you get notified so you can easily add some nodes.

We hope to employ this same cacheing cluster on our main website, as well as a number of our other web properties in the near future.