I haven’t failed, I’ve just tried a lot of ML approaches that don’t work – andromeda yelton Skip to content andromeda yelton Menu Home About Contact Resume HAMLET LITA Talks Machine Learning (ALA Midwinter 2019) Boston Python Meetup (August 21, 2018) SWiB16 LibTechConf 2016 Code4Lib 2015 Keynote Texas Library Association 2014 Online Northwest 2014: Five Conversations About Code New Jersey ESummit (May 2, 2013) Westchester Library Association (January 7, 2013) Bridging the Digital Divide with Mobile Services (Webjunction, July 25 2012) I haven’t failed, I’ve just tried a lot of ML approaches that don’t work Andromeda Uncategorized April 16, 2021 “Let’s blog every Friday,” I thought. “It’ll be great. People can see what I’m doing with ML, and it will be a useful practice for me!” And then I went through weeks on end of feeling like I had nothing to report because I was trying approach after approach to this one problem that simply didn’t work, hence not blogging. And finally realized: oh, the process is the thing to talk about… Hi. I’m Andromeda! I am trying to make a neural net better at recognizing people in archival photos. After running a series of experiments — enough for me to have written 3,804 words of notes — I now have a neural net that is ten times worse at its task. 🎉 And now I have 3,804 words of notes to turn into a blog post (a situation which gets harder every week). So let me catch you up on the outline of the problem: Download a whole bunch of archival photos and their metadata (thanks, DPLA!) Use a face detection ML library to locate faces, crop them out, and save them in a standardized way Benchmark an off-the-shelf face recognition system to see how good it is at identifying these faces Retrain it Benchmark my new system Step 3: profit, right? Well. Let me also catch you up on some problems along the way: Alas, metadata Archival photos are great because they have metadata, and metadata is like labels, and labels mean you can do supervised learning, right? Well…. Is he “Du Bois, W. E. B. (William Edward Burghardt), 1868-1963” or “Du Bois, W. E. B. (William Edward Burghardt) 1868-1963” or “Du Bois, W. E. B. (William Edward Burghardt)” or “W.E.B. Du Bois”? I mean, these are all options. People have used a lot of different metadata practices at different institutions and in different times. But I’m going to confuse the poor computer if I imply to it that all these photos of the same person are photos of different people. (I have gone through several attempts to resolve this computationally without needing to do everything by hand, with only modest success.) What about “Photographs”? That appears in the list of subject labels for lots of things in my data set. “Photographs” is a person, right? I ended up pulling in an entire other ML component here — spaCy, to do some natural language processing to at least guess which lines are probably names, so I can clear the rest of them out of my way. But spaCy only has ~90% accuracy on personal names anyway and, guess what, because everything is terrible, in predictable ways, it has no idea “Kweisi Mfume” is a person. Is a person who appears in the photo guaranteed to be a person who appears in the photo? Nope. Is a person who appears in the metadata guaranteed to be a person who appears in the photo? Also nope! Often they’re a photographer or other creator. Sometimes they are the subject of the depicted event, but not themselves in the photo. (spaCy will happily tell you that there’s personal name content in something like “Martin Luther King Day”, but MLK is unlikely to appear in a photo of an MLK day event.) Oh dear, linear algebra OK but let’s imagine for the sake of argument that we live in a perfect world where the metadata is exactly what we need — no more, no less — and its formatting is perfectly consistent. 🦄 Here you are, in this perfect world, confronted with a photo that contains two people and has two names. How do you like them apples? I spent more time than I care to admit trying to figure this out. Can I bootstrap from photos that have one person and one name — identify those, subtract them out of photos of two people, go from there? (Not reliably — there’s a lot of data I never reach that way — and it’s horribly inefficient.) Can I do something extremely clever with matrix multiplication? Like…once I generate vector space embeddings of all the photos, can I do some sort of like dot-product thing across all of my photos, or big batches of them, and correlate the closest-match photos with overlaps in metadata? Not only is this a process which begs the question — I’d have to do that with the ML system I have not yet optimized for archival photo recognition, thus possibly just baking bad data in — but have I mentioned I have taken exactly one linear algebra class, which I didn’t really grasp, in 1995? What if I train yet another ML system to do some kind of k-means clustering on the embeddings? This is both a promising approach and some really first-rate yak-shaving, combining all the question-begging concerns of the previous paragraph with all the crystalline clarity of black box ML. Possibly at this point it would have been faster to tag them all by hand, but that would be admitting defeat. Also I don’t have a research assistant, which, let’s be honest, is the person who would usually be doing this actual work. I do have a 14-year-old and I am strongly considering paying her to do it for me, but to facilitate that I’d have to actually build a web interface and probably learn more about AWS, and the prospect of reading AWS documentation has a bracing way of reminding me of all of the more delightful and engaging elements of my todo list, like calling some people on the actual telephone to sort out however they’ve screwed up some health insurance billing. Nowhere to go but up Despite all of that, I did actually get all the way through the 5 steps above. I have a truly, spectacularly terrible neural net. Go me! But at a thousand-plus words, perhaps I should leave that story for next week…. Share this: Twitter Facebook Like this: Like Loading... Tagged fridAI Published by Andromeda Romantic analytical technologist librarian. View all posts by Andromeda Published April 16, 2021 Post navigation Previous Post this time: speaking about machine learning Leave a Reply Cancel reply Enter your comment here... Fill in your details below or click an icon to log in: Email (required) (Address never made public) Name (required) Website You are commenting using your WordPress.com account. ( Log Out /  Change ) You are commenting using your Google account. ( Log Out /  Change ) You are commenting using your Twitter account. ( Log Out /  Change ) You are commenting using your Facebook account. ( Log Out /  Change ) Cancel Connecting to %s Notify me of new comments via email. Notify me of new posts via email. Create a free website or blog at WordPress.com. Privacy & Cookies: This site uses cookies. By continuing to use this website, you agree to their use. To find out more, including how to control cookies, see here: Cookie Policy %d bloggers like this: