I haven’t failed, I’ve just tried a lot of ML approaches that don’t work – andromeda yelton


			I haven’t failed, I’ve just tried a lot of ML approaches that don’t work
				
			
				Andromeda			
							
			
			
				April 16, 2021			

					
“Let’s blog every Friday,” I thought. “It’ll be great. People can see what I’m doing with ML, and it will be a useful practice for me!” And then I went through weeks on end of feeling like I had nothing to report because I was trying approach after approach to this one problem that simply didn’t work, hence not blogging. And finally realized: oh, the process is the thing to talk about…


Hi. I’m Andromeda! I am trying to make a neural net better at recognizing people in archival photos. After running a series of experiments — enough for me to have written 3,804 words of notes — I now have a neural net that is ten times worse at its task. 🎉


And now I have 3,804 words of notes to turn into a blog post (a situation which gets harder every week). So let me catch you up on the outline of the problem:


	1. Download a whole bunch of archival photos and their metadata (thanks, DPLA!)
	2. Use a face detection ML library to locate faces, crop them out, and save them in a standardized way
	3. Benchmark an off-the-shelf face recognition system to see how good it is at identifying these faces
	4. Retrain it
	5. Benchmark my new system
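
The "standardized way" in the cropping step hides a few decisions. Here is a minimal sketch of one possible convention, not the actual pipeline behind this post: it assumes the face detector hands back a pixel bounding box as (left, top, right, bottom), pads it, squares it, and keeps it inside the image so every crop can later be resized to the same dimensions. The function name and the 20% margin are made up for illustration.

```python
def square_crop_box(box, image_size, margin=0.2):
    """Return a padded, square crop box clamped to the image bounds.

    box        -- (left, top, right, bottom) in pixels (assumed detector output)
    image_size -- (width, height) of the source image
    margin     -- extra padding per side, as a fraction of the face size
                  (0.2 is an arbitrary illustrative default)
    """
    left, top, right, bottom = box
    width, height = image_size
    # Side length: the larger face dimension plus a margin on each side.
    side = max(right - left, bottom - top) * (1 + 2 * margin)
    # Never ask for a square bigger than the image itself.
    side = int(round(min(side, width, height)))
    # Center the square on the detected face.
    center_x = (left + right) / 2
    center_y = (top + bottom) / 2
    new_left = int(round(center_x - side / 2))
    new_top = int(round(center_y - side / 2))
    # Shift the square back inside the image if it hangs over an edge.
    new_left = max(0, min(new_left, width - side))
    new_top = max(0, min(new_top, height - side))
    return (new_left, new_top, new_left + side, new_top + side)


# Toy numbers: a 100x100 face inside a 200x200 image.
print(square_crop_box((40, 30, 140, 130), (200, 200)))
```

The returned tuple uses the same (left, top, right, bottom) order that common image libraries expect for a crop box, so it could be passed along to whatever does the actual cropping.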


Step 6: profit, right? Well. Let me also catch you up on some problems along the way:


Alas, metadata


Archival photos are great because they have metadata, and metadata is like labels, and labels mean you can do supervised learning, right?


Well….


Is he “Du Bois, W. E. B. (William Edward Burghardt), 1868-1963” or “Du Bois, W. E. B. (William Edward Burghardt) 1868-1963” or “Du Bois, W. E. B. (William Edward Burghardt)” or “W.E.B. Du Bois”? I mean, these are all options. People have used a lot of different metadata practices at different institutions and in different times. But I’m going to confuse the poor computer if I imply to it that all these photos of the same person are photos of different people. (I have gone through several attempts to resolve this computationally without needing to do everything by hand, with only modest success.)
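
To make the problem concrete, here is a minimal sketch of one way to collapse those four spellings into a single comparison key. It is not the approach actually used for this project (which, as noted, only had modest success), and `name_key` is a hypothetical helper. It assumes library-style headings of the form "Last, First (Fuller Form), dates" and will mangle anything with a suffix like "Jr." after a second comma.

```python
import re


def name_key(label: str) -> str:
    """Reduce a name heading to a rough key for matching variants."""
    # Drop parenthetical fuller forms such as "(William Edward Burghardt)".
    s = re.sub(r"\([^)]*\)", " ", label)
    # Drop life dates such as "1868-1963" or an open-ended "1948-".
    s = re.sub(r"\b\d{4}\s*-\s*(\d{4})?", " ", s)
    # Invert "Last, First" into "First Last" when a comma is present.
    parts = [p.strip() for p in s.split(",") if p.strip()]
    if len(parts) >= 2:
        s = " ".join(parts[1:] + parts[:1])
    elif parts:
        s = parts[0]
    else:
        s = ""
    # Treat periods as spaces so "W.E.B." and "W. E. B." agree, then
    # lowercase and collapse runs of whitespace.
    s = s.replace(".", " ")
    return " ".join(s.lower().split())


variants = [
    "Du Bois, W. E. B. (William Edward Burghardt), 1868-1963",
    "Du Bois, W. E. B. (William Edward Burghardt) 1868-1963",
    "Du Bois, W. E. B. (William Edward Burghardt)",
    "W.E.B. Du Bois",
]
print({name_key(v) for v in variants})  # a single key if the variants agree
```

A key like this only handles formatting differences. It does nothing for genuinely different forms of a name (nicknames, married names, transliterations), which is where the by-hand work tends to come back.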


What about “Photographs”? That appears in the list of subject labels for lots of things in my data set. “Photographs” is a person, right? I ended up pulling in an entire other ML component here — spaCy, to do some natural language processing to at least guess which lines are probably names, so I can clear the rest of them out of my way. But spaCy only has ~90% accuracy on personal names anyway and, guess what, because everything is terrible, in predictable ways, it has no idea “Kweisi Mfume” is a person.
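
A minimal sketch of that filtering step, under some loud assumptions: the model name `en_core_web_sm` is a guess (the post does not say which spaCy model was used), both function names are hypothetical, and the hand-maintained allowlist is one possible workaround for names the model misses, not something described above. The spaCy import is kept inside the factory so the filter itself can be exercised without the library installed.

```python
def make_spacy_person_check(model_name="en_core_web_sm"):
    """Build an is_person(text) function backed by spaCy's entity recognizer.

    Assumes spaCy and the named model are installed; the model name is an
    assumption, not taken from the post.
    """
    import spacy  # imported lazily so the rest of this sketch has no dependency

    nlp = spacy.load(model_name)

    def is_person(text):
        # True if any entity spaCy finds in the label is tagged PERSON.
        return any(ent.label_ == "PERSON" for ent in nlp(text).ents)

    return is_person


def filter_person_labels(labels, is_person, allowlist=frozenset()):
    """Keep labels that look like personal names.

    allowlist holds names to keep no matter what the model says, as a
    manual patch for known misses.
    """
    return [label for label in labels if label in allowlist or is_person(label)]


# Usage with a stand-in checker; swap in make_spacy_person_check() for real runs.
fake_check = lambda text: text == "W.E.B. Du Bois"
labels = ["Photographs", "Kweisi Mfume", "W.E.B. Du Bois"]
print(filter_person_labels(labels, fake_check, allowlist={"Kweisi Mfume"}))
```

Passing the checker in as a function also makes it easy to compare models, or to measure how many allowlist entries a given model actually needs.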


Is a person who appears in the photo guaranteed to be a person who appears in the metadata? Nope.


Is a person who appears in the metadata guaranteed to be a person who appears in the photo? Also nope! Often they’re a photographer or other creator. Sometimes they are the subject of the depicted event, but not themselves in the photo. (spaCy will happily tell you that there’s personal name content in something like “Martin Luther King Day”, but MLK is unlikely to appear in a photo of an MLK day event.)


Oh dear, linear algebra


OK but let’s imagine for the sake of argument that we live in a perfect world where the metadata is exactly what we need — no more, no less — and its formatting is perfectly consistent. 🦄


Here you are, in this perfect world, confronted with a photo that contains two people and has two names. How do you like them apples?


I spent more time than I care to admit trying to figure this out. Can I bootstrap from photos that have one person and one name — identify those, subtract them out of photos of two people, go from there? (Not reliably — there’s a lot of data I never reach that way — and it’s horribly inefficient.)
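
The reach of that bootstrapping idea can be estimated from the metadata alone. The sketch below is a best case, assuming face matching works perfectly: it starts from photos with one face and one name, then repeatedly "subtracts" known people from photos where exactly one name is still unknown. The record fields (`face_count`, `names`) and the toy data are invented for illustration.

```python
def bootstrap_reachable(photos):
    """Return the set of names that subtraction-style bootstrapping could label.

    photos -- list of dicts with hypothetical keys "face_count" and "names".
    This is an upper bound: it pretends every known face is recognized.
    """
    # Seed set: photos with exactly one face and exactly one name.
    known = {
        p["names"][0]
        for p in photos
        if p["face_count"] == 1 and len(p["names"]) == 1
    }
    changed = True
    while changed:  # repeated full passes: simple, and not efficient
        changed = False
        for p in photos:
            if p["face_count"] != len(p["names"]):
                continue  # extra or missing faces: cannot pair names to faces
            unknown = [n for n in p["names"] if n not in known]
            # If one name is left after removing known people, the one
            # remaining face must belong to it.
            if len(unknown) == 1:
                known.add(unknown[0])
                changed = True
    return known


# Toy data with placeholder names.
photos = [
    {"face_count": 1, "names": ["Ada"]},
    {"face_count": 2, "names": ["Ada", "Ben"]},
    {"face_count": 2, "names": ["Cy", "Dee"]},
    {"face_count": 2, "names": ["Ben", "Eve"]},
]
reached = bootstrap_reachable(photos)
everyone = {n for p in photos for n in p["names"]}
print(sorted(reached), sorted(everyone - reached))
```

In the toy data, the pair who only ever appear together are never reached, which is the same failure described above at a much smaller scale.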


Can I do something extremely clever with matrix multiplication? Like…once I generate vector space embeddings of all the photos, can I do some sort of like dot-product thing across all of my photos, or big batches of them, and correlate the closest-match photos with overlaps in metadata? Not only is this a process which begs the question — I’d have to do that with the ML system I have not yet optimized for archival photo recognition, thus possibly just baking bad data in — but have I mentioned I have taken exactly one linear algebra class, which I didn’t really grasp, in 1995?
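
For the curious, the "dot-product thing" might look roughly like the sketch below: cosine similarity between embeddings, then a check of which names each photo shares with its closest match. This is only an illustration of the idea, with made-up two-dimensional vectors standing in for real embeddings, hypothetical function names, and the same question-begging caveat as above, since the embeddings would come from the model that has not yet been improved.

```python
import math


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_with_shared_names(embeddings, names):
    """For each photo id, return (closest other id, names the two share).

    embeddings -- dict of photo id -> vector (needs at least two photos)
    names      -- dict of photo id -> list of names from the metadata
    """
    result = {}
    for pid, vec in embeddings.items():
        best = max(
            (other for other in embeddings if other != pid),
            key=lambda other: cosine(vec, embeddings[other]),
        )
        result[pid] = (best, set(names[pid]) & set(names[best]))
    return result


# Toy vectors and placeholder names, invented for illustration.
embeddings = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0]}
names = {"p1": ["Ada"], "p2": ["Ada", "Ben"], "p3": ["Cy"]}
matches = nearest_with_shared_names(embeddings, names)
print(matches)
```

A closest match that shares exactly one name is weak evidence that the face belongs to that name. A closest match that shares nothing is a hint that either the embedding or the metadata is wrong, with no way to tell which.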


What if I train yet another ML system to do some kind of k-means clustering on the embeddings? This is both a promising approach and some really first-rate yak-shaving, combining all the question-begging concerns of the previous paragraph with all the crystalline clarity of black box ML.
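
As a rough picture of what clustering the embeddings would involve, here is a bare-bones k-means (Lloyd's algorithm) in plain Python. It is a sketch under stated assumptions, not anything run for this project: toy two-dimensional points stand in for real embeddings, the first k points serve as starting centers so the run is deterministic, and a real attempt would more likely reach for a library such as scikit-learn.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means: returns (cluster index per point, final centers)."""
    # Deterministic start: use the first k points as the initial centers.
    centers = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center
        # (squared Euclidean distance).
        assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centers


# Toy points forming two obvious groups.
points = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.2], [5.1, 4.9], [0.2, 0.1]]
assignment, centers = kmeans(points, k=2)
print(assignment)
```

Even in the best case, the clusters come out numbered, not named. Tying cluster 0 to a person still depends on the metadata, which is where the question-begging comes back in.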


Possibly at this point it would have been faster to tag them all by hand, but that would be admitting defeat. Also I don’t have a research assistant, which, let’s be honest, is the person who would usually be doing this actual work. I do have a 14-year-old and I am strongly considering paying her to do it for me, but to facilitate that I’d have to actually build a web interface and probably learn more about AWS, and the prospect of reading AWS documentation has a bracing way of reminding me of all of the more delightful and engaging elements of my todo list, like calling some people on the actual telephone to sort out however they’ve screwed up some health insurance billing.


Nowhere to go but up


Despite all of that, I did actually get all the way through the 5 steps above. I have a truly, spectacularly terrible neural net. Go me! But at a thousand-plus words, perhaps I should leave that story for next week….




			Tagged
	fridAI
