Summary of your 'study carrel'
==============================

This is a summary of your Distant Reader 'study carrel'. The Distant Reader harvested & cached your content into a collection/corpus. It then applied sets of natural language processing and text mining routines to the collection. The results of this process were reduced to a database file -- a 'study carrel'. The study carrel can then be queried, thus bringing to light specific characteristics of your collection. These characteristics can help you summarize the collection as well as enumerate things you might want to investigate more closely.

Eric Lease Morgan
May 27, 2019

Number of items in the collection; 'How big is my corpus?'
----------------------------------------------------------
1

Average length of all items measured in words; "More or less, how big is each item?"
------------------------------------------------------------------------------------
21338

Average readability score of all items (0 = difficult; 100 = easy)
------------------------------------------------------------------
44

Top 50 statistically significant keywords; "What is my collection about?"
-------------------------------------------------------------------------
1 oclc
1 link
1 Wikibase
1 Metadata
1 Library
1 FIGURE
1 Digital
1 Data
1 Collection

Top 50 lemmatized nouns; "What is discussed?"
---------------------------------------------
220 datum
197 entity
164 project
118 image
105 collection
103 metadata
89 figure
86 data
85 description
74 work
74 tool
73 oclc
66 user
66 pilot
59 system
56 interface
55 model
43 process
42 property
41 application
40 staff
39 type
39 part
39 information
37 concept
36 statement
36 example
36 class
35 source
35 field
35 discoverability
34 view
34 library
33 participant
33 material
32 search
32 reconciliation
32 discovery
31 heading
30 partner
30 item
30 institution
28 relationship
28 environment
26 workflow
26 phase
25 team
25 page
25 connection
24 value

Top 50 proper nouns; "What are the names of persons or places?"
--------------------------------------------------------------
184 Wikibase
157 CONTENTdm
122 Data
86 Digital
81 Collection
76 Metadata
43 FIGURE
42 View
41 Library
40 Annotator
38 Image
33 OCLC
33 Discoverability
27 University
27 Transforming
27 OpenRefine
27 Linked
26 Explorer
25 Wikidata
22 Transportation
22 Libraries
22 JSON
22 Cleveland
22 Analyzer
21 Field
20 Public
19 Minnesota
19 MediaWiki
19 Hub
18 Temple
17 Schema.org
17 Retriever
17 Project
14 •
14 LD
14 Describer
13 Pilot
13 Miami
13 IIIF
12 RDF
12 Dogs
11 º
11 User
11 SPARQL
11 Huntington
10 Wikimedia
10 VIAF
10 Phase
10 January
10 Dublin

Top 50 personal pronouns; "To whom are things referred?"
--------------------------------------------------------
66 it
58 we
20 they
19 us
10 you
5 them
3 itself
1 one
1 i
1 https://merrick.library.miami.edu/cubanheritage/chc0468/
1 https://github.com/wetneb/openrefine-wikibase

Top 50 lemmatized verbs; "What do things do?"
---------------------------------------------
520 be
189 link
115 use
74 improve
70 have
59 create
57 transform
52 develop
51 provide
49 describe
41 include
38 base
38 add
36 find
35 make
35 help
34 see
32 associate
29 depict
26 evaluate
25 need
25 manage
23 work
22 test
22 do
22 build
21 represent
20 update
19 •
18 show
18 share
17 look
17 define
16 relate
16 reconcile
16 generate
15 give
14 embed
13 support
13 select
13 exist
13 catalog
13 carry
13 allow
12 think
12 search
12 orient
12 illustrate
12 display
12 bring

Top 50 lemmatized adjectives and adverbs; "How are things described?"
---------------------------------------------------------------------
75 more
57 new
52 other
51 also
47 not
47 large
39 online
32 related
30 digital
30 different
26 descriptive
24 well
22 out
20 local
20 explorer
20 creative
20 contextual
18 subject
18 most
17 initial
17 cultural
16 first
15 up
14 separate
13 such
13 only
13 as
12 specific
11 single
11 same
11 important
11 additional
10 several
10 rich
10 quickly
10 many
10 great
10 depicted
9 very
9 useful
9 then
9 potential
9 external
8 current
7 unique
7 simple
7 significant
7 shared
7 rather
7 locally

Top 50 lemmatized superlative adjectives; "How are things described to the extreme?"
-------------------------------------------------------------------------
6 most
2 good
1 least
1 late
1 great

Top 50 lemmatized superlative adverbs; "How do things do to the extreme?"
------------------------------------------------------------------------
12 most
2 well

Top 50 Internet domains; "What Webbed places are alluded to in this corpus?"
----------------------------------------------------------------------------
131 researchworks.oclc.org
20 orcid.org
16 www.w3.org
16 www.oclc.org
12 www.mediawiki.org
12 hdl.huntington.org
11 reflections.mndigital.org
11 cdm16002.contentdm.oclc.org
9 help.oclc.org
8 en.wikipedia.org
7 cdm16014.contentdm.oclc.org
6 merrick.library.miami.edu
6 digital.library.temple.edu
4 www.wikidata.org
4 w3c.github.io
4 iiif.io
4 doi.org
4 cplorg.contentdm.oclc.org
4 cdm16003.contentdm.oclc.org
3 schema.org
2 www.wikipedia.org
2 www.python.org
2 www.php.net
2 www.nngroup.com
2 www.geonames.org
2 www.dublincore.org
2 shex.io
2 search.google.com
2 projectmirador.org
2 pro.dp.la
2 openrefine.org
2 linked.art
2 iphylo.org
2 github.com
2 dublincore.org
2 dp.la
2 creativecommons.org
2 cdm17191.contentdm.oclc.org
1 oc.lc
1 cdm15725.contentdm.oclc.org

Top 50 URLs; "What is hyperlinked from this corpus?"
----------------------------------------------------
6 http://www.oclc.org/research/areas/data-science/linkeddata/linked-data-prototype.html
4 http://www.mediawiki.org/wiki/Manual:Pywikibot/Overview
4 http://researchworks.oclc.org/cdmld/screenshots/entity-Q144548.png
4 http://researchworks.oclc.org/cdmld/screenshots/cdm-property-proposal.png
3 http://researchworks.oclc.org/cdmld/screenshots/wikibase-system-architecture.png
3 http://researchworks.oclc.org/cdmld/screenshots/sparql-visualization.png
3 http://researchworks.oclc.org/cdmld/screenshots/retriever-2.png
3 http://researchworks.oclc.org/cdmld/screenshots/retriever-1.png
3 http://researchworks.oclc.org/cdmld/screenshots/phase-diagram.png
3 http://researchworks.oclc.org/cdmld/screenshots/openrefine-project.png
3 http://researchworks.oclc.org/cdmld/screenshots/image-annotator-3.png
3 http://researchworks.oclc.org/cdmld/screenshots/image-annotator-2.png
3 http://researchworks.oclc.org/cdmld/screenshots/image-annotator-1.png
3 http://researchworks.oclc.org/cdmld/screenshots/field-analyzer-2.png
3 http://researchworks.oclc.org/cdmld/screenshots/field-analyzer-1.png
3 http://researchworks.oclc.org/cdmld/screenshots/explorer-7.png
3 http://researchworks.oclc.org/cdmld/screenshots/explorer-6.png
3 http://researchworks.oclc.org/cdmld/screenshots/explorer-5.png
3 http://researchworks.oclc.org/cdmld/screenshots/explorer-3.png
3 http://researchworks.oclc.org/cdmld/screenshots/explorer-2.png
3 http://researchworks.oclc.org/cdmld/screenshots/explorer-1.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q73829.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q73246.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q73226.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q71945.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q166325.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q165895.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q148552.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q147731.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q147700.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q143578.png
3 http://researchworks.oclc.org/cdmld/screenshots/entity-Q142481.png
3 http://researchworks.oclc.org/cdmld/screenshots/describer-1.png
3 http://researchworks.oclc.org/cdmld/screenshots/class-ontology.png
3 http://researchworks.oclc.org/cdmld/screenshots/cdm15725-p16003coll7-14.png
3 http://researchworks.oclc.org/cdmld/screenshots/cdm-property-proposal-is-defined-by.png
3 http://researchworks.oclc.org/cdmld/screenshots/cdm-item-talk-Q148309.png
3 http://researchworks.oclc.org/cdmld/screenshots
3 http://reflections.mndigital.org/?f%5Bcollection
3 http://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm/Verner+Collection+of+Panoramic+Negatives/field/physic/mode/all/conn/and/order/title
3 http://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm/Photographs%20of%20the%20California%20Missions%20by%20William%20Henry%20Jackson/field/physic/mode/exact/conn/and
3 http://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm/Palmer+Conner+Collection+of+Color+Slides+of+Los+Angeles%2C+1950+-+1970/field/physic/mode/all/conn/and/order/nosort
3 http://hdl.huntington.org/digital/collection/p15150coll2/search/searchterm
3 http://en.wikipedia.org/wiki/Triplestore
3 http://digital.library.temple.edu/digital/search/collection/p16002coll6!p15037coll19!p15037coll14!p16002coll2/searchterm/YWCA%20Philadelphia%20Branches%20Records/field/digitb/mode/exact/conn/and/order/title/ad/asc
3 http://cplorg.contentdm.oclc.org/digital/collection/p4014coll18/search/searchterm/cleveland%20picture%20collection/field/collec/mode/exact/conn/and/order/sortda/ad/asc
3 http://cdm16014.contentdm.oclc.org/digital/collection/p4014coll18/search/searchterm/jasper+wood/field/creato/mode/all/conn/and
3 http://cdm16003.contentdm.oclc.org/digital/collection/p15150coll2/search/searchterm/Edwin%20Hubble%20Papers/field/physic/mode/exact/conn/and
3 http://cdm16002.contentdm.oclc.org/digital/collection/p245801coll0/search/searchterm/Templana%20Photograph%20Collection/field/reposa/mode/exact/conn/and
3 http://cdm16002.contentdm.oclc.org/digital/collection/p245801coll0/search/searchterm/Templana%20Event%20Album%20Collection/field/reposa/mode/exact/conn/and

Top 50 email addresses; "Who are you gonna call?"
-------------------------------------------------
2 oclcresearch@oclc.org

Top 50 positive assertions; "What sentences are in the shape of noun-verb-noun?"
-------------------------------------------------------------------------------
5 contentdm linked data
1 application was originally
1 application was re-
1 application was usable
1 contentdm linked datat
1 data describing cities
1 data is mostly
1 data is openly
1 data is powerful
1 entities are already
1 entities are not
1 entities included identifiers
1 entity is already
1 entity is truly
1 interface is also
1 library linked data
1 library was excited
1 metadata makes greater
1 modeled linked data
1 oclc added equivalent
1 oclc had not
1 oclc was able
1 pilot was more
1 project did not
1 project was well
1 project were all
1 property linking back
1 staff was also
1 staff were grateful
1 staff were incredible
1 staff worked diligently
1 tool is valuable
1 tool provides partners
1 tools provided oclc
1 users make sense
1 wikibase describing lake
1 wikibase is accessible
1 wikibase using contentdm
1 work doing digitization

Top 50 negative assertions; "What sentences are in the shape of noun-verb-no|not-noun?"
---------------------------------------------------------------------------------------
1 entities are not common

Sizes of items; "Measures in words, how big is each item?"
----------------------------------------------------------
21338 0yLgw3HoKQ

Readability of items; "How difficult is each item to read?"
-----------------------------------------------------------
44.0 0yLgw3HoKQ

Item summaries; "In a narrative form, how can each item be abstracted?"
-----------------------------------------------------------------------
0yLgw3HoKQ
Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project The OCLC CONTENTdm Linked Data Pilot project team consisted of the following OCLC staff: In the CONTENTdm Linked Data Pilot project, OCLC partnered testing new applications built in the Wikibase environment for data retrieval, image annotation, This report describes the course of the CONTENTdm Linked Data Pilot project and its primary CONTENTdm Linked Data Pilot project used the Wikibase environment, which includes several OCLC staff exported CONTENTdm metadata for each suggested collection and created an entity a project for each collection in the program OpenRefine25 (figure 11), which provides tools for data CONTENTdm collection metadata in an OpenRefine project.26 View a larger image online. the Wikibase, OCLC developed a CONTENTdm customization that embeds the Schema.org data https://www.oclc.org/en/events/2020/devconnect-online-2020/devconnect-2020-creating-linked-descriptive-data-for-contentdm.html https://researchworks.oclc.org/cdmld/screenshots/google-structured-data-testing-tool.png.
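
The introduction notes that a study carrel is a database file that can be queried. As a minimal sketch of what such a query might look like, the following Python example builds a small in-memory SQLite database and tallies lemmas by part-of-speech tag, in the spirit of the 'Top 50 lemmatized nouns' listing. The use of SQLite, the table name `pos`, its columns, the tag values, and the helper names `build_demo_carrel` and `top_lemmas` are all assumptions made for illustration; this summary does not describe the real carrel's storage format or schema, which may differ.

```python
import sqlite3


def build_demo_carrel():
    """Create an in-memory stand-in for a carrel database (hypothetical schema)."""
    connection = sqlite3.connect(":memory:")
    # Assumed layout: one row per token, with its lemma and part-of-speech tag.
    connection.execute(
        "CREATE TABLE pos (item TEXT, token TEXT, lemma TEXT, tag TEXT)"
    )
    demo_rows = [
        ("demo", "data", "datum", "NNS"),
        ("demo", "data", "datum", "NNS"),
        ("demo", "entities", "entity", "NNS"),
        ("demo", "linked", "link", "VBN"),
        ("demo", "entity", "entity", "NN"),
        ("demo", "data", "datum", "NNS"),
    ]
    connection.executemany("INSERT INTO pos VALUES (?, ?, ?, ?)", demo_rows)
    return connection


def top_lemmas(connection, tag_prefix, limit=50):
    """Return (count, lemma) pairs for the most frequent lemmas whose tag starts with tag_prefix."""
    cursor = connection.execute(
        """
        SELECT COUNT(*) AS n, lemma
        FROM pos
        WHERE tag LIKE ?
        GROUP BY lemma
        ORDER BY n DESC, lemma ASC
        LIMIT ?
        """,
        (tag_prefix + "%", limit),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    carrel = build_demo_carrel()
    # Print noun lemmas in the same "count term" layout the summary uses.
    for count, lemma in top_lemmas(carrel, "NN"):
        print(count, lemma)
```

Ties are broken alphabetically here only to make the ordering deterministic; the summary does not say how the Distant Reader orders terms with equal counts.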