key: cord-0600210-wm4neu99
authors: Villarroel, Beatriz; Pelckmans, Kristiaan; Solano, Enrique; Laaksoharju, Mikael; Souza, Abel; Dom, Onyeuwaoma Nnaemeka; Laggoune, Khaoula; Mimouni, Jamal; Mattsson, Lars; Soodla, Johan; Castillo, Diego; Shultz, Matthew E.; Aworka, Rubby; Comerón, Sébastien; Geier, Stefan; Marcy, Geoffrey; Gupta, Alok C.; Bergstedt, Josefine; Bar, Rudolf E.; Buelens, Bart; Prieto, M. Almudena; Ramos-Almeida, Cristina; Wamalwa, Dismas Simiyu; Ward, Martin J.
title: Launching the VASCO citizen science project
date: 2020-09-22
journal: nan
DOI: nan
sha: 2b2c5bf32a10a68ce7b758cba86e4397846e54e4
doc_id: 600210
cord_uid: wm4neu99

The Vanishing & Appearing Sources during a Century of Observations (VASCO) project investigates astronomical surveys spanning a 70-year time interval, searching for unusual and exotic transients. We present herein the VASCO Citizen Science Project, which uses three different approaches to the identification of unusual transients in a given set of candidates: hypothesis-driven, exploratory-driven and machine learning-driven (which is of particular benefit for SETI searches). To address the big data challenge, VASCO combines methods from the Virtual Observatory, user-aided machine learning and visual inspection through citizen science. In this article, we demonstrate the citizen science project and the new and improved candidate selection process, and give a progress report. We also present the VASCO citizen science network led by amateur astronomy associations mainly located in Algeria, Cameroon and Nigeria. At the moment of writing, the citizen science project has carefully examined 12,000 candidate image pairs in the data, and has so far identified 713 objects classified as "vanished". The most interesting candidates will be followed up with optical and infrared imaging, together with observations by the most potent radio telescopes.

Anomalous objects in astronomy are a gold mine for expanding our knowledge about extreme physical conditions and identifying new astrophysical phenomena. Anomalies have always fascinated astronomers and many important discoveries were first regarded as such. For instance, when the first optical spectra of the radio-emitting quasars 3C 273 and 3C 48 were acquired, astronomers encountered weird and unusual spectra that they considered anomalous, only to soon understand these quasi-stellar objects were in fact highly redshifted (Matthews & Sandage 1963; Schmidt 1963). Likewise, when the first pulsars were discovered (Hewish et al. 1968), the unexpected pulsating radio signals were considered so unlikely that Little Green Men were suggested as a serious possibility. With further investigation, astronomers ultimately developed an understanding of the physics underlying these entirely natural, albeit extreme, objects.

Some anomalies have come to stay with us as interesting examples of rare astrophysical objects. An example is Przybylski's star (Przybylski et al. 1963), a variable star showing unusual amounts of iron and nickel in its spectrum while having high abundances of e.g. strontium and uranium. Another is the well-known transient η Carinae, whose lightcurve showed a giant outburst followed by a slow fading over decades. Other previously well-known astrophysical anomalies have fallen from prominence, following explanation of the underlying physics or identification of the supposed anomaly as an artifact. One example is Halton Arp's redshift anomalies (Arp 1987; Burbidge 2001), now believed to be chance overlaps in images, but which were once the subject of a grand quarrel amongst cosmologists in the 1980s.

Some recent anomalies have received much attention in the media; for example 'Oumuamua, a cigar-shaped interstellar visitor that followed a gravitationally unbound orbit and does not resemble a typical comet (Meech et al. 2017), or Tabby's Star (Boyajian et al. 2016), a star with an unusual slow dimming caused by obscuration due to an uneven ring of surrounding dust (Meng et al. 2017). Ross 128, a red dwarf, also figured in the media due to its unusual emission. These examples may need another few years of examination before we understand the key details of the physical mechanisms involved, and it is possible that once we do understand them, we will no longer even consider them anomalies. The same goes for Fast Radio Bursts (FRBs), a completely novel class of poorly understood transients, for which the responsible mechanism(s) remain a hotly debated topic. Already in the early 2000s, the importance of developing state-of-the-art methods to identify fascinating anomalies was discussed by e.g. Djorgovski (2000); Djorgovski et al. (2001). The importance of anomalies with respect to the Search for Extraterrestrial Intelligence (SETI) was carefully discussed in the same papers. A recent work that compiles a list of anomalies is the Breakthrough Listen Exotica Catalog (Lacki et al. 2020).

One of the successful ways of identifying anomalies is through citizen science projects, where volunteers help scientists in scrutinizing the extremely large datasets assembled by astronomical surveys. Citizen science projects have already earned a good reputation by leading to interesting discoveries. We can thank the Galaxy Zoo project (Lintott et al. 2008, 2011) for improving our understanding of galaxy evolution, utilizing visual inspection of images of galaxies acquired by the Sloan Digital Sky Survey (SDSS) and subsequent classification according to the most suitable morphological class. An important consequence of this citizen science project was the discovery of "Green peas", a rare class of galaxies with very low masses and high star-formation rates that looked round and green (Cardamone et al. 2009).

Interesting astrophysical anomalies such as Hanny's Voorwerp (a rare quasar ionization echo) and Tabby's Star (KIC 8462852) are the results of such citizen science searches. Citizen science projects are now getting competition from machine learning-based identification of anomalies, see e.g. Baron & Poznanski (2017); Giles & Walkowicz (2019). Machine learning is certainly helpful in the analysis of giant datasets, but while modern computing and automated routines can aid the identification of unusual objects, they cannot yet replace the human pattern recognition competence honed by millions of years of evolution.

The most famous citizen science project, the Galaxy Zoo, allowed for an entire community of citizen science projects to be assembled. The Zooniverse has inspired millions of users to join scientists of different fields in exploration of nature. The current Zooniverse projects operate in different ways where a registered (or even an unregistered) user can contribute. The total number of classifications exceeds half a billion. Current projects within time domain astronomy give the user different roles. Backyard Worlds shows blinking images taken at different times that permit a user to identify fast-moving objects in WISE/NEOWISE data and mark them in the data. The Planet Hunters Transiting Exoplanet Survey Satellite (TESS) project lets the user identify and mark possible transits. SuperWASP Variable Stars asks the user to classify light curves into a well-known category or mark them as junk. Supernova Hunters uses a target image, an older image from Pan-STARRS, and the resulting difference images, and asks the user to distinguish between real supernovae and bogus detections.

Herein, we present the citizen science project related to the Vanishing & Appearing Sources during a Century of Observations (VASCO) project (Villarroel et al. 2016, 2020). VASCO is a research program that compares historical data from 1950s sky catalogues to modern sky surveys. Using a 70-year temporal baseline, we target stars that may have appeared or vanished during the last seven decades: extreme phenomena that may be so rare that they are missed by transient sky surveys due to the short time windows. Simultaneously, more conventional strong one-epoch transients may be identified with the same approach. In Villarroel et al. (2020) we identified 150,000 candidate objects that need to be visually inspected, based on the cross-match methods described by Soodla (2019). Of these, we inspected about 15% with the help of images from the Sloan Digital Sky Survey (SDSS). We found about 100 red point sources, nearly all of which were visible in only one epoch and in the red images of the POSS-I survey. The shapes and time scales involved rule out solar system objects, variable stars, low-redshift supernovae, and AGN.

It is remarkable that the identified point sources so far have no counterparts in modern transient surveys such as the intermediate Palomar Transient Facility (iPTF), the Gaia survey, or the Catalina Sky Survey. These surveys tend to observe hundreds of short transients only visible in one image in one night, using well-calibrated and homogeneous CCD data. These automated surveys tend to discover thousands of flaring and erupting stars, cataclysmic variables, supernovae, GRB afterglows, variable or erupting active galactic nuclei (AGN), and microlensing events. The transients detected by photometry that are deemed interesting enough to follow up on usually also get a spectrum taken to determine their nature. In this way tens of thousands of transients have been discovered and categorized.

One may expect that if the red transients were caused by variable or flaring stars, the occurrence would happen once more within the time window of the transient surveys. The fact that these point sources have escaped all transient surveys so far suggests they were not phenomena with repetition time scales less than a few years. For example, a star that flares up once per week would be found by these transient surveys. This suggests that our red transients are not among the most common transient phenomena, or at least are not among the type of transients the big surveys are interested in following up.
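The repetition argument can be made concrete with a small back-of-the-envelope sketch. It assumes, purely for illustration, that flares occur as a Poisson process and that a survey catches any single flare with some fixed probability; the function name and the parameter `catch_fraction` are hypothetical and do not come from any of the surveys mentioned.

```python
from math import exp


def detection_probability(flares_per_year, survey_years, catch_fraction):
    """Probability that a survey records at least one flare.

    Assumes flares follow a Poisson process and that each flare is caught
    independently with probability catch_fraction (a hypothetical number
    standing in for cadence, sky coverage and weather losses).
    """
    expected_catches = flares_per_year * survey_years * catch_fraction
    return 1.0 - exp(-expected_catches)


# A star flaring once per week, monitored for 5 years, with only 1% of
# its flares caught: expected catches = 52 * 5 * 0.01 = 2.6
weekly_flarer = detection_probability(52, 5, 0.01)

# A source repeating once per century under the same assumptions
rare_source = detection_probability(0.01, 5, 0.01)
```

Under these assumed numbers the weekly flarer is detected with a probability of roughly 0.93, while the once-per-century source is almost certainly missed, which is the qualitative point made in the paragraph above.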

In the VASCO citizen science project, we use images from different sky surveys to search for both vanishing and appearing stars, as well as transients. Some of the objects found may be similar to what is found automatically by the large transient surveys, and some will be different.

The VASCO citizen science project combines three different strategies to search for anomalous objects:

1. Direct identification: did the star vanish or appear? The simplest of all questions is approached here with a classical citizen science approach. This part of the citizen science project follows the concept of hypothesis-driven science. 

The web interface is currently using the original sample of 150,000 candidates presented by Villarroel et al. (2020), obtained through a 30 arcsec cross-matching of the USNO and Pan-STARRS catalogues (Soodla 2019). This sample of 150,000 candidates may host even more fascinating objects than the transients already found, and maybe even one of the real vanishing objects we are looking for. However, this candidate list is large and in need of preprocessing. On the 1st of October 2020, this subset will be replaced by a refined list of candidates without spurious entries caused by easily identifiable problems.

As is true for any archive-based project, VASCO can be affected by the following problems:

• Discovery: Where does the information of interest reside?

• Access: Each astronomical archive has developed its own data access system which makes data querying quite cumbersome if the number of services to be consulted is high.

• Representation: Most of the time, data gathered from different archives cannot be directly compared. In the case of images, different sky coverage, orientation and/or pixel size demand a pre-processing analysis for the comparison to be possible.

All these issues can be largely alleviated if a Virtual Observatory methodology is considered. The Virtual Observatory (VO) is an international initiative that was born in the year 2000 and whose main goal is to guarantee easy and efficient access and analysis of the information hosted in astronomical archives. In particular we have taken advantage of VO to provide citizens with as clean a sample of objects as possible, where most of the instrumental artifacts have been filtered out. In this context two actions were accomplished:

• Removal of USNO sources lying in the vicinity of bright stars: Diffraction spikes are lines radiating from the centers of bright astronomical sources. These features are generated when the incoming light is diffracted by the structure which upholds the secondary mirror in reflecting telescopes. To identify and remove the sources of the sample of 150,000 candidates associated with diffraction spikes, we visually inspected several hundred images to find the distribution pattern of these spurious sources in terms of the apparent magnitude of the bright star and the distance to it. The visual analysis led to the definition of a bright star as an object fulfilling the following two heuristic criteria (Figure 2):

- A source whose magnitude in any of the USNO B,R bands is brighter than 12.4, and

- A source whose brightest magnitude in the USNO B,R bands fulfills

mag ≤ −0.0995312 × angDist + 14.312 (1)

where mag is the brightest USNO magnitude and angDist is the separation between the bright star and the USNO source.

These two conditions are conservative enough to ensure (at the cost of having some degree of contamination) that real sources are not removed.

• Removal of USNO sources not associated with POSS sources: We built a catalogue of POSS sources by running SExtractor (Bertin & Arnouts 1996). To keep faint sources, a low threshold (just 2σ above the background level) was adopted. Also, to minimize the number of artifacts, we required a maximum separation between the USNO and the SExtractor-POSS source of 3.5 arcsec as well as a signal-to-noise ratio ≥ 10 for the SExtractor POSS sources.

After applying these filters we ended up with 68,632 sources (45% of the original sample), ready to be analyzed by citizens.
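The two filters described above can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: the function and argument names are hypothetical, and the unit of angDist is assumed here to be arcseconds, which the text does not state. Only the numerical thresholds (12.4, the linear relation of Eq. 1, 3.5 arcsec and S/N ≥ 10) are taken from the text.

```python
# Thresholds quoted in the text; everything else is a hypothetical illustration.
BRIGHT_MAG_LIMIT = 12.4   # a bright star is brighter than this in USNO B or R
SLOPE = -0.0995312        # slope of Eq. (1)
INTERCEPT = 14.312        # intercept of Eq. (1)
MAX_SEP_ARCSEC = 3.5      # maximum USNO to POSS separation
MIN_SNR = 10.0            # minimum signal-to-noise of the POSS detection


def near_bright_star(star_mag, ang_dist):
    """True if a candidate at separation ang_dist (assumed arcsec) from a
    star whose brightest USNO B/R magnitude is star_mag meets both criteria."""
    return star_mag < BRIGHT_MAG_LIMIT and star_mag <= SLOPE * ang_dist + INTERCEPT


def has_poss_counterpart(sep_arcsec, snr):
    """True if a POSS detection is close enough and significant enough."""
    return sep_arcsec <= MAX_SEP_ARCSEC and snr >= MIN_SNR


def keep_candidate(neighbours, poss_matches):
    """Decide whether a USNO candidate survives both filters.

    neighbours:   iterable of (star_mag, ang_dist) for nearby stars
    poss_matches: iterable of (sep_arcsec, snr) for POSS detections
    """
    if any(near_bright_star(mag, dist) for mag, dist in neighbours):
        return False  # likely a diffraction-spike artifact
    return any(has_poss_counterpart(sep, snr) for sep, snr in poss_matches)
```

For example, `keep_candidate([(10.0, 20.0)], [(1.0, 20.0)])` rejects the candidate, because a magnitude-10 star at a separation of 20 lies below the line of Eq. (1), whereas the same candidate with no bright neighbour is kept.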

The citizen science project is accessed through the VASCO web interface (Pelckmans & Laaksoharju et al. in prep). The VASCO web interface differs from the usual Zooniverse web interface in that each "mission" takes longer to fulfil and has more steps. A bigger emphasis has also been put on the playability aspect of the interface to enhance the entertainment factor (Pelckmans & Laaksoharju et al. in prep).

A brief guide upon arrival to the web page is immediately given through a splash screen (Figure 3 ). Once the user has clicked on the splash screen, he or she can engage in examining the images (Figure 4) .

Each user can decide how deeply they wish to study each candidate. The web interface presents the user with a random pair of images, where the left shows an old image and the right, a new image. Old blue images are compared with modern blue images, and old red with modern red images. Every time a new pair is shown, the system randomizes which colour band is displayed. The old and new images have different photometric depths, which the user is asked to take into consideration. In the current implementation, the old images are taken from the POSS surveys and the new images from Pan-STARRS, in order to study the candidates identified by Villarroel et al. (2020). A user is asked to investigate the images in several different steps. Each of these steps allows anomaly detection through hypothesis-driven science, exploratory science and AI-driven science (Section 1). Moreover, the variety of approaches that a user can choose between covers the search space widely discussed by SETI papers, see e.g. Sheikh et al. (in prep); Singam et al. (in prep).

An underlying artificial intelligence (AI), intended to help select the most interesting images for the users, is currently being trained. The artificial intelligence learns from the users' image treatment. The main principles and theory behind the design and structure of the web interface are outlined in Pelckmans & Laaksoharju et al. (in prep). The implementation of the AI into the webpage is described by Castillo (2019).

To bring forward the most interesting candidates, the AI matches the two images and calculates a matching index that shows how well the two images match in the most central part of the image (see the right "Accuracy" bar in Figure 3). As a comparison, a user's manual matching is shown in the left accuracy bar next to it. The user's goal is to match better than the AI does. Images with the lowest matching index are generally deemed interesting and worth following up on (Pelckmans & Laaksoharju et al. in prep). An example of a matched pair of images is shown in Figure 5.
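The text does not specify how the matching index is computed. As a hedged illustration only, the sketch below uses a normalized cross-correlation over a small central patch, which is one common way to score how well two aligned images agree. The function names, the patch size and the choice of metric are all assumptions made here, not the method of Pelckmans & Laaksoharju et al. or Castillo (2019).

```python
from math import sqrt


def central_crop(image, half_width):
    """Flatten the square of side 2*half_width+1 around the image centre.

    image is a list of rows (lists of pixel values) and must be large
    enough to contain the requested patch.
    """
    cy, cx = len(image) // 2, len(image[0]) // 2
    return [
        image[y][x]
        for y in range(cy - half_width, cy + half_width + 1)
        for x in range(cx - half_width, cx + half_width + 1)
    ]


def matching_index(old_image, new_image, half_width=1):
    """Normalized cross-correlation of the central patches, in [-1, 1].

    Hypothetical stand-in for the project's matching index: values near 1
    mean the central regions agree, low values flag a mismatch.
    """
    a = central_crop(old_image, half_width)
    b = central_crop(new_image, half_width)
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    norm = sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if norm == 0:
        return 0.0  # a flat patch carries no structure to match
    return sum(p * q for p, q in zip(da, db)) / norm


# Toy 5x5 images: a central point source, and an empty field
star = [[0.0] * 5 for _ in range(5)]
star[2][2] = 10.0
empty = [[0.0] * 5 for _ in range(5)]

same = matching_index(star, star)       # source present in both epochs
vanished = matching_index(star, empty)  # source missing in the new image
```

In this toy case a source present in both images scores close to 1, while a source missing from the new image scores 0, so ranking by the lowest index would surface the second pair first.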

The user can choose from five options:

1. "The object is still there."

2. "The image has a defect."

3. "It has moved!"

4. "It has vanished!"

5. "Other."

Sometimes, an object might appear to have moved, while the fields-of-view of the images have been rotated. By rotating the images with the help of the two small squares attached to the Pan-STARRS image, the user can investigate whether the central star actually moved or if the field of view orientation gave rise to such an effect. Other times, defects might plague one or both of the images. Sometimes, the user might note something remarkable he or she cannot put into words, in which case we ask the user to mark it as "Other" and advise them to use the "Inspect" button, which opens a new window with a commentary field (Figure 6). A tutorial and a tutorial video 7 8 are accessible on the webpage. Another tutorial aimed at educational use can be obtained by request.

A citizen science project that generates a large quantity of data also requires a large interactive effort to succeed. The VASCO citizen science project is public and welcomes all interested users to participate. 9 The web page also has a French version 10 , which makes the project accessible for volunteers in French-speaking countries. We will expand the web interface to offer versions in more languages, e.g. Spanish, over time.

The mission of the citizen science project can be easily adapted to strikingly different levels of the user's astrophysics background. In its simplest form, a user can just play with matching two images and attempt to see if a star has vanished or not, which is a goal achievable for children in their early school years. The "Inspect" part of the project, however, is much more challenging, and one has a higher probability of identifying something truly interesting with a solid astrophysics background. This step includes images from different epochs and in different colour bands, and may be suitable for university astronomy undergraduate students as a step in training their ability to recognize and identify various types of astronomical sources, but is not limited only to astronomy undergraduates.

The current working mode of VASCO is to collaborate with selected amateur astronomy associations, institutes and educational centres. Figure 7 shows the number of classifications made by the citizen science project as a function of time, which shows a strong recent increase in the number of classifications. At the time of writing, we have obtained about 12,000 classifications.

7 https://www.youtube.com/watch?v=eM84b6-Z xY
8 https://www.youtube.com/watch?v=gtuF9ISAMRE
9 https://www.su.se/english/research/research-news/look-to-the-sky-and-help-researchers-in-a-new-citizen-science-project-1.496340
10 http://user.it.uu.se/~kripe367/MLblink.FR/#/

Figure 6. After pressing the "Inspect" button, ten different images from the POSS surveys and Pan-STARRS are shown. The user is advised to do a careful comparison, take into account the different depth of the given images and asked to remark upon any unexpected findings in the commentary window.

4.1. Collaborating with schools and amateur associations

A good citizen science project should preferably be a two-way street, where the interactivity is for the benefit of the user as well as for the scientists. A satisfied user will feel engaged in the project, including the research outcomes, and feel good about his or her contribution to the scientific question. He or she will also learn during the process and feel that there is always more to learn about astronomy and always more fascinating objects to discover. For the users who are the most engaged, we have included the "Inspect" option where the user can investigate the case in 10 different colour bands taken at different times. If a discovery is made, we strongly encourage the user to also submit contact details, so that he or she can be involved in the follow-up studies of the object. This makes it possible for any person who has made an important discovery to be part of the VASCO research team itself and to be granted credit for the discovery. We also provide user feedback in this fashion. Similarly, scientists will have an increased chance to succeed in the goal of finding anomalies by involving more interested users and thereby examining more candidates.

While we do not currently have a big platform with registered users, we are in close contact with selected groups of students and amateur astronomers who wish to participate in the VASCO project. This allows us to interact more closely and adapt the programs.

Collaboration with students can be a good strategy for two reasons: (1) students/pupils can participate as part of their education and are therefore more likely to actually spend the time that is necessary to learn how to make an optimal contribution; (2) they are good control groups for evaluating the effectiveness of the citizen science effort when supervised by their teachers.

Collaborating with amateur astronomer associations has different advantages. Many amateurs engage in astronomical activities in their free time out of natural interest. Many also have good observational skills and background knowledge obtained through years as dedicated amateurs. This means that they might be more engaged to start with and therefore produce faster and higher-quality results. Through interaction with amateurs and students, we hope the project will also inspire and engage them to learn more about transient astronomy.

We are collaborating with student groups and amateur associations in several countries.

In Nigeria the VASCO citizen science project was organized by CBSS with a supporting grant from IAU/Office of Astronomy for Development. The participants come from Nigeria and Cameroon, with a total of 30 people participating, including both students and amateur astronomers from different science backgrounds. Each participant conducted the research from his/her home due to the COVID-19 pandemic. The supporting grant was given to the participants to acquire WiFi, which they used for the project. By the end of the project about 3000 images had been analyzed by the participants. Some of the participants also reported the discovery of some interesting changes between the Pan-STARRS and USNO images in the process.

In Constantine, Algeria, a subset of members from the Sirius Astronomy Association has classified thousands of images. The members in the Sirius association report that their interest in connecting with VASCO is due to its intriguing links to non-conventional astronomy and even searches for extraterrestrial intelligence. They also enjoy the multidisciplinary aspect of the project. The 23-strong team from Sirius that is working with VASCO is diverse, with ages ranging between 16 and 40; most of them are undergraduate students, with some PhD students and high school students. The group is 48% women and 52% men. The members have diverse academic backgrounds, ranging from natural sciences and engineering to humanities and business-related fields.

For the VASCO citizen science project, the team is divided into groups which are in turn sub-divided into pairs. This last division is to enable double checking of the results, in addition to ensuring that the job is done in case, for whatever reason, one of the members fails to report. The image sets are then distributed among the pairs, starting with 10 to 15 images per pair and later increasing to at least 25 to 35 images. The members are given a maximum of one week to turn in their work in the form of a text file. The work of every member is then reviewed by the teacher/leader of the team, who makes sure that the images are treated in the proper way. Every image has also been carefully inspected through the "Inspect" button. The members are encouraged to comment whenever needed. Feedback is given to each team member through additional Zoom meetings.
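The double-checking step within a pair could be supported by a small script that compares the two members' submissions and lists the images a team leader should look at first. The sketch below is hypothetical: the function name, the dictionary format and the label strings are assumptions for illustration, and the text does not describe how the submitted text files are actually compared.

```python
def cross_check(labels_a, labels_b):
    """Compare the classifications turned in by the two members of a pair.

    labels_a, labels_b: dict mapping an image identifier to the label chosen
    by each member (hypothetical format). Returns the labels both members
    agree on and the sorted list of images needing review, i.e. those with
    conflicting labels or reported by only one member.
    """
    agreed, needs_review = {}, []
    for image_id in sorted(set(labels_a) | set(labels_b)):
        a, b = labels_a.get(image_id), labels_b.get(image_id)
        if a is not None and a == b:
            agreed[image_id] = a
        else:
            needs_review.append(image_id)
    return agreed, needs_review


member_a = {"img01": "vanished", "img02": "still there", "img03": "defect"}
member_b = {"img01": "vanished", "img02": "moved"}
agreed, needs_review = cross_check(member_a, member_b)
```

Here the pair agrees on `img01`, disagrees on `img02`, and only one member reported `img03`, so the last two are flagged for the leader's review.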

Training took place through dedicated virtual workshops. These meetings went well beyond the data processing aspect, branching into topics like the stellar life-cycle, the formation of nebulae and star clusters, extraterrestrial life and astrobiology, and the basics of spectrometry and its applications to detecting exoplanets and studying organic molecules in the ISM. They also covered the essential principles of AI (artificial intelligence) and its applications to astronomy, in view of the significant AI component in VASCO.

More than 1000 images were treated during 2 months of activity, and the members in the Sirius association are ramping up the pace. They have also implemented a scheme of "record breaking" challenges to motivate the team members to treat more images by beating their own records.

The Sirius organization also has a side activity, "Learn through VASCO", where members are encouraged to give talks about topics related to VASCO underpinnings and goals such as the life-cycle of stars, the Fermi paradox, the Kardashev scale, Dyson spheres, etc.

Individual students and amateur astronomers from other countries are also participating in the project. The Société Astronomique de France and Sociedad de Astronomia de Puerto Rico (PRAS) have efficiently involved their members to take part in the citizen science project.

In Sweden, we are collaborating with "Vetenskapens Hus" (House of Science). Vetenskapens Hus is an educational center in Stockholm. They provide activities for students of all ages, from primary school to high school, as well as further training for teachers. The VASCO citizen science project is now included as a part of a course for teachers offered by the center. The course is a one-day seminar event, which will be held for the first time on October 14, 2020. It will begin with lectures by VASCO members, followed by a hands-on exercise where the participating teachers learn to use the web interface. The seminar will close with a discussion session aimed at developing ideas around how the VASCO citizen science project can be implemented in science curricula at various levels.

Figure 7. The performance over time. We show the number of classifications made by the web interface from the early onset of the project, including the beta testing stage from January to March. The number of cases labeled as "vanished" by the citizens is 713. About 880 additional sources have been highlighted as interesting in "Inspect".

The course at Vetenskapens Hus is designed for a Swedish audience only. However, the possibility of arranging similar seminar events online and in English, with a wider and international audience, has been discussed. The practicalities surrounding such an event have not yet been addressed.

In these times of the COVID-19 pandemic, building "virtual" networks is necessary as many schools and universities are closed, and social distancing is encouraged. Much of the organization of a citizen science effort can be accomplished online. Social media can be exploited to market the project and enable efficient communication. The VASCO citizen science project has a Facebook page for interested users. This is not without challenges, though. The advertisement of the project through social media relies on scientists being comfortable and experienced users of social media. Another problem is that not everyone has access to reliable internet connections. In some countries, a large part of the population may not even have access to a computer and steady WiFi at home. Cell phones seem to be more common, however. In the future, we hope to be able to adapt the platform to work via smartphone.

The number of interesting candidates resulting from the new, upcoming cross-match processes may reach millions, placing the project into the regime of "big data". VASCO is therefore working to adopt methods from the Virtual Observatory, and on the further development of an artificial intelligence aided by visual inspection of candidates by citizen scientists.

The VASCO citizen science project has now launched and together with schools and amateur associations we are orchestrating a community effort to search for anomalies in astronomical images separated by 70 years. It combines both exploratory-driven and hypothesis-driven approaches to the identification of astrophysical anomalies, with a particular focus on searching for vanishing and appearing objects.

So far, we have made 12,000 classifications. The citizen science project will refine the methods for candidate selection and include new data sets with time.

QSOs, Redshifts and Controversies

VASCO: Developing AI-Crawlers for ML-Blink

Wide Field Surveys in Cosmology

Virtual Observatories of the Future

The First Year of MAXI: Monitoring Variable X-ray Sources, Special Publ. IPCR-127

A System for Cross-matching All-sky Surveys

Technosignatures as a Priority in Planetary Science

We thank Ruben Cubo for help with developing the VASCO web interface. This citizen science effort has been driven by many individuals and societies who have joined our effort towards finding vanishing stars. We thank PRAS in Puerto Rico and Société Astronomique de France (SAF) for helping with the project, as well as all our friends and colleagues who have helped to spread information about the project. This research has made use of the Spanish Virtual Observatory (http://svo.cab.inta-csic.es) supported from the Spanish MINECO/FEDER through grant AyA2017-84089. B.V. is funded by the Swedish Research Council (Vetenskapsrådet, grant no. 2017-06372). M.E.S. acknowledges financial support from the Annie Jump Cannon Fellowship, supported by the University of Delaware and endowed by the Mount Cuba Astronomical Observatory. We thank the IAU/OAD for supporting the participation of West Africans in this project, through the WAROAD office with the OAD special COVID-19 grant. The Pan-STARRS1 Surveys (PS1) and the PS1 public science archive have been made possible through contributions by the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max-Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg and the Max Planck Institute for Extraterrestrial Physics, Garching, The Johns Hopkins University, Durham University, the University of Edinburgh, Queen's University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under Grant No. NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, National Science Foundation Grant No. AST-1238877, the University of Maryland, Eotvos Lorand University (ELTE), the Los Alamos National Laboratory, and the Gordon and Betty Moore Foundation.