title: Matrix Syncer -- A Multi-chain Data Aggregator For Supporting Blockchain-based Metaverses
authors: Sun, Xinyao; Lu, Yi; Sun, Jinghan; Tang, Bohao; Rehak, Kyle D.; Zhang, Shuyi
date: 2022-04-08
Due to the rising complexity of the metaverse's business logic and the low-latency nature of the metaverse, developers typically encounter the challenge of effectively reading, writing, and retrieving historical on-chain data in order to facilitate their functional implementations at scale. While accessing blockchain states is simple, more advanced real-world operations such as search, aggregation, and conditional filtering are not available when interacting directly with blockchain networks, particularly when dealing with requirements for on-chain event reflection. We offer Matrix Syncer, the ultimate middleware that bridges the data access gap between blockchains and end-user applications. Matrix Syncer is designed to facilitate the consolidation of on-chain information into a distributed data warehouse while also enabling customized on-chain state transformation for scalable storage, access, and retrieval. It offers a unified layer for both on- and off-chain state, as well as fast and flexible atomic queries. Matrix Syncer is easily incorporated into any infrastructure to aggregate data from various blockchains concurrently, such as Ethereum and Flow. The system has been deployed to support several metaverse projects with a total value of more than $15 million USD.
Metaverse is a portmanteau of the prefix "meta" (which means "beyond") and the suffix "verse" (shorthand for "universe"). It is derived from Neal Stephenson's science fiction novel Snow Crash [1] and literally refers to a cosmos beyond our physical world.
As stated in [2], the metaverse is predicated on advancements in four key areas: immersive realism, ubiquitous access and identification, interoperability, and scalability. To enable the construction of a completely functional metaverse, each of those areas must be adequately developed. The development can be grouped into several subdomains, including numerous multimedia technologies: network transmission and prototyping, computer graphics, image processing, virtual reality, and augmented reality [2]. Globalization has increased the volume of international communication and cooperation; however, geographic distance remains an objective hindrance that increases costs. Additionally, as a result of the COVID-19 pandemic, many events have been suspended to comply with pandemic preventive standards [3]. These stringent requirements have created a major opportunity for initiatives such as teleconferences and virtual gatherings, for which the metaverse could provide significant accessibility to meet those social requirements [3]. Moreover, decentralization has been described as a critical component of initiating the fifth phase of metaverse development [2]. Decentralized development has resulted in the decoupling of the client and server sides of a virtual world system, which the blockchain protocol has facilitated in recent years. The metaverse is expected to connect everyone across the globe. Study [4] asserted that blockchain technology is critical for the sustainability of metaverse ecosystems because it ensures decentralization and fairness. From a macro perspective, a three-layer metaverse architecture is outlined in [3] as 1) ecosystem, 2) interaction, and 3) infrastructure. Each layer is composed of distinct modules that work together to make a comprehensive, interactive, and functional metaverse.
As a result, having an interoperable, robust, and low-latency middleware between each module becomes critical for ensuring that diverse virtual worlds are able to seamlessly connect and overlap. As we progress toward a more decentralized metaverse, we must examine the solutions we wish to emphasize in the metaverse and other advancements. Numerous companies, including game creators (Roblox and Epic Games), software giants (Microsoft, Amazon), social media conglomerates (Facebook, now Meta, and Twitter), and graphics processor manufacturers (Nvidia), are involved in the metaverse, beginning with online events. Extensive research and development has been conducted to optimize virtual gatherings in the metaverse. NVIDIA's Omniverse is a scalable, multi-GPU real-time reference development platform for 3D simulation and design collaboration, and Alibaba's Cloud Metaverse has been released for the purpose of utilizing cloud computing to construct the entire virtual world as a service. Meta recently launched Horizon Worlds, a virtual-reality social networking platform that allows up to 20 avatars to explore, hang out, and build in the virtual realm, as well as a number of revolutionary gadgets, controllers, and supporting hardware, including VR gloves with haptic feedback. However, all of these advancements in multimedia technology are web 2.0 based. The service providers are centralized identity providers, and users' digital identities are produced and held centrally. These advancements are not sufficient to create a transparent, stable, and sustainable digital economy, where digital properties belong to the users, not the operators [5]. The success of Bitcoin [6] as a decentralized transaction system has garnered significant attention. Later, in 2013, Vitalik Buterin proposed Ethereum [7], a decentralized computation platform that introduced the use of smart contracts to execute programs autonomously and transparently on the blockchain.
Since then, a variety of public blockchain networks have been established, including Flow and EOS, each of which supports the development of decentralized applications (DApps) and has its own design philosophy aimed at improving the user experience and system performance. Numerous DApp-based metaverses, such as Sandbox and Decentraland, have attracted increasing attention and consumers in recent years, resulting in significant revenues [3]. This demonstrates how decentralization could ensure that digital properties are unique, permanent, and transferable, which benefits the metaverse's development and enables the construction of a fair, free, and sustainable society [8]. Web 2.0's inadequacies, along with the existence of public blockchain technology, have gradually increased public awareness of privacy, data rights, censorship, and identity difficulties. These factors have facilitated the transition of users to a more decentralized Web 3.0 metaverse. We are still in the early stages of the development of decentralized technologies. When developing a decentralized application for the metaverse, the developer frequently encounters constraints on on-chain compute power and storage, along with a wide variety of public blockchain interfaces [9]. Due to the increasing complexity of the metaverse's business logic and the requirement for low-latency user experiences, properly reading, writing, and retrieving historical on-chain data in order to support functional implementation at scale has always been an inevitable challenge [10]. The Graph is a popular decentralized protocol for indexing and querying data on Ethereum-based blockchains. It enables the querying of data that was previously not directly accessible. However, it provides only a restricted set of interfaces for transforming data from an on-chain structure to a GraphQL-compatible schema.
It offers fast and efficient querying of historical blockchain data, but does not address reflective requirements such as making an external service request or initiating transactions in response to on-chain states and received events. Typically, developers can establish their own web 3.0 infrastructure and communicate directly with a self-hosted blockchain node in order to fetch on-chain state changes (events) and perform reflective processing on their own services. Unfortunately, it is widely known that setting up a self-hosted node is costly and time-consuming [11]. There is a demand for utilities that lower the entry barrier and make blockchain data more accessible. Infrastructure-as-a-Service (IaaS) offerings are among the most critical. Infura is leading the charge, providing developers, decentralized application teams, and corporations from a variety of industries with a suite of tools for connecting their apps to blockchain networks. Alchemy goes further than Infura by adding support for Ethereum layer-2 and other public blockchains (e.g., Flow). The most significant utility it has introduced is the Alchemy Notify API, which uses webhooks to trigger external actions, allowing for on-chain reflective implementations. However, neither of these projects offers a proper way to retrieve historical on-chain events through traditional query requests. Top traditional IT businesses are attempting to improve the virtual social gathering user experience in order to make it a more effective and secure new way of living. To meet the demands of a decentralized metaverse, developers will need to overcome some key issues caused by current web 3.0 limitations. Numerous products and services, such as ChainIDE [12], have been developed to facilitate the development and deployment of smart contracts.
One of the key constraints on the scalability and accessibility of the majority of decentralized projects is the inability to index and react effectively to on-chain events that trigger project-specific business logic after the contracts have been deployed. Furthermore, blockchain is currently a niche industry, one of the key causes being that blockchains lack interoperable infrastructure, which prevents applications from being deployed on a large scale [9]. Thus, a solid infrastructure that connects web 2.0 and web 3.0 development stacks and supports multiple blockchains is essential to accelerate the decentralized metaverse revolution. In this work, we present Matrix Syncer, the ultimate middleware that bridges the data access and event trigger gaps between blockchains and end-user services and applications. Matrix Syncer is intended to make it easier to consolidate on-chain data into a distributed data warehouse while also enabling customizable on-chain state transformation for scalable storage, access, and retrieval. It provides a uniform layer for both on-chain and off-chain states, as well as quick and flexible atomic querying and event triggering. Furthermore, Matrix Syncer can easily be integrated into any infrastructure to aggregate data from multiple blockchains simultaneously. The metaverse, we believe, is an event-driven environment in which each user interaction should trigger a chain of reflections altering the state of related properties. An application in the metaverse typically relies on such events to execute reactive activities in response to the current world state and system logic. Matrix Syncer is a cloud-based IaaS platform that allows developers to effortlessly develop the aforementioned workflows without having to worry about blockchain-related setup or maintenance. An overview of the system architecture is shown in Fig. 1.
Matrix Syncer can help developers create web 3.0 compatible applications while maintaining their familiar web 2.0 development style. As seen in the figure, developers configure their events of interest (EOIs) by registering them in the Block Sync DB, which records each EOI's globally unique identity, derived from the chain type, contract address, and event signature. Each event has its own set of variables that control how the block syncer distributes jobs; the three most critical ones are initBlockHeight, syncedStartBlockHeight, and syncedLatestBlockHeight. Developers define the initBlockHeight when registering a new event. It serves as the starting point for synchronizing a specific event and is typically the block height of the contract deployment. This is necessary because the contract may have been deployed, and may have performed transactions, before enrolling on our platform. Here, syncedStartBlockHeight refers to the block height at the time this event was first synced, and syncedLatestBlockHeight refers to the block height that the block syncer most recently scanned for this event. The block syncer distributes regular synchronization jobs in parallel for each event with the batch size K = min(mB, cL − γ − syncedLatestBlockHeight), where mB denotes the maximum batch size for a given chain and cL is the height of the most recently minted block reported by the connected archive node. mB is an adaptive parameter that can be adjusted based on the transactions-per-second (TPS) rate as well as the rate of finalized blocks in order to ensure that the processing capacity of the entire system is sufficient for each chain's throughput. γ is a chain-specific safety margin that keeps the syncer from reading the most recent blocks, which may still be reverted, as is the case with Ethereum when an archive node receives an uncle block and later reverts it to adhere to the longest chain protocol [13].
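The batch-size rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `next_sync_range` is hypothetical, and the choice to start the next job at syncedLatestBlockHeight + 1 is an assumption that the text does not spell out.

```python
from typing import Optional, Tuple


def next_sync_range(
    max_batch: int,      # mB: maximum batch size for the chain
    chain_head: int,     # cL: latest minted block height seen by the archive node
    gamma: int,          # gamma: number of most recent blocks left unsynced
    synced_latest: int,  # syncedLatestBlockHeight for this event
) -> Optional[Tuple[int, int]]:
    """Return the inclusive block range of the next regular sync job, or None."""
    # K = min(mB, cL - gamma - syncedLatestBlockHeight)
    k = min(max_batch, chain_head - gamma - synced_latest)
    if k <= 0:
        return None  # nothing new lies outside the gamma safety margin
    # Assumption: the next job starts right after the last scanned block.
    first = synced_latest + 1
    return first, first + k - 1
```

For example, with mB = 50, cL = 1000, γ = 5, and syncedLatestBlockHeight = 900, K = min(50, 95) = 50 and the job covers blocks 901 to 950. Once syncedLatestBlockHeight reaches 990, K shrinks to 5 and the job stops at block 995, which is cL − γ.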
In practice, γ can be set to a fixed number (e.g., 5 for Ethereum, the default number of confirmation blocks for freshly minted blocks defined by the Go Ethereum client) to ensure that the synced block information is already persistent in the network. When initBlockHeight is less than syncedStartBlockHeight, the block syncer distributes a dedicated group of backfilling jobs to rapidly scan for the target event in the range [initBlockHeight, syncedStartBlockHeight] in parallel. Once the backfilling task has been completed, the syncedStartBlockHeight of this event is set equal to the initBlockHeight. The fetcher is implemented against the corresponding interfaces of each blockchain. It reads all EOIs for a certain block range and converts them to a meaningful data structure that can be stored in a database. The majority of public blockchains provide an interface or SDK for interacting with their nodes in order to perform simple chain state or event queries. For example, on Ethereum, we can use ethers.js 13 to fetch events by block range from an archive node. It is a little more complicated when dealing with Flow, since Flow's blockchain data is segmented via Sporks. Each Spork stores data for a specific block range. When we use the official Flow endpoint, we can only request blocks or events from the current Spork. To address these issues, we designed the ultimate event fetcher as a middleware module that automatically subdivides large block ranges into smaller sections with associated Spork endpoints and then merges all fetched events together to provide a level of usability similar to Ethereum's. The source code is available at [GitHub] (to be revealed later to respect the double-blind rule). In our case, we just need to construct an adapter for each chain using a suitable library and connect to an archive node capable of retrieving historical data. The user can specify the type of nodes they wish to employ.
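The range subdivision performed by the Flow fetcher can be sketched as follows. This is an illustrative sketch only: the `Spork` record, the `split_by_spork` and `fetch_all` names, and the `fetch` callable are hypothetical stand-ins, not Flow SDK calls or the authors' code, and any Spork boundaries and endpoints passed in are supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Spork:
    endpoint: str               # access endpoint serving this Spork (caller-supplied)
    first_height: int           # first block height stored in this Spork
    last_height: Optional[int]  # last block height, or None for the current Spork


def split_by_spork(
    start: int, end: int, sporks: List[Spork], max_batch: int
) -> List[Tuple[str, int, int]]:
    """Split the inclusive range [start, end] into (endpoint, lo, hi) jobs."""
    jobs = []
    for spork in sorted(sporks, key=lambda s: s.first_height):
        lo = max(start, spork.first_height)
        hi = end if spork.last_height is None else min(end, spork.last_height)
        while lo <= hi:  # skipped entirely when the Spork does not overlap the range
            chunk_hi = min(hi, lo + max_batch - 1)
            jobs.append((spork.endpoint, lo, chunk_hi))
            lo = chunk_hi + 1
    return jobs


def fetch_all(
    start: int,
    end: int,
    sporks: List[Spork],
    max_batch: int,
    fetch: Callable[[str, int, int], list],  # hypothetical per-endpoint fetcher
) -> list:
    """Fetch every sub-range from its Spork endpoint and merge the results in order."""
    events = []
    for endpoint, lo, hi in split_by_spork(start, end, sporks, max_batch):
        events.extend(fetch(endpoint, lo, hi))
    return events
```

Because the jobs are independent, the same job list could be handed to parallel workers. The sequential loop in `fetch_all` is kept only to make the merge order obvious.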
Users can either run their own self-hosted node or provide the endpoint of a third-party IaaS platform such as Infura or Alchemy. Additionally, Matrix Syncer offers its own archive nodes for multiple chains, giving developers more freedom to optimize cost-effectiveness. When registering an event, developers have the option of specifying a database schema for converting on-chain events into records in a database (the event store). All event fetchers feed EOIs into a persistent module as decoded data. The persistent module converts incoming events into the data structures defined by the developers and saves them in the event store. All subsequent queries are performed against the data model specified in the schema. We adopted DynamoDB 14 as a cloud database solution to achieve 1) horizontal scaling via managed partitioning and sharding, 2) high availability with assured SLAs, and 3) high-level consistency. After the backfilling process is completed, developers are able to run advanced queries, such as filtering, pagination, sorting, grouping, and joining result sets. Matrix Syncer is designed to accommodate multiple chains. Hence, it brings the important advantage of allowing a unified data structure for DApps that support similar business logic on multiple chains. Users can define the same schema for different chains in order to convert heterogeneous raw events into a standard format and facilitate interoperability in the metaverse. We assure end-to-end data completeness for each block fetching task by performing a checksum verification at the
If this equation holds, we persisted all registered events, and non-registered events are correctly skipped.
13 https://docs.ethers.io/
14 https://aws.amazon.com/dynamodb/
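The schema mapping and per-type bookkeeping described in this section can be sketched as follows. Everything here is an assumption made for illustration: the unified field names, the raw event shapes for each chain, and the `persist` helper are hypothetical, and an in-memory list stands in for the event store. The sketch does not reproduce the fetch-phase checksum equation; it only shows registered events being mapped and counted per type while non-registered events are skipped.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple


# Hypothetical mappers: each reduces a chain-specific raw event to one shared schema.
def from_ethereum(raw: dict) -> dict:
    return {"chain": "ethereum", "type": "Transfer", "token_id": str(raw["tokenId"]),
            "sender": raw["from"], "recipient": raw["to"], "block": raw["blockNumber"]}


def from_flow(raw: dict) -> dict:
    return {"chain": "flow", "type": "Transfer", "token_id": str(raw["id"]),
            "sender": raw["from"], "recipient": raw["to"], "block": raw["blockHeight"]}


# Registered events, keyed by (chain, event name).
MAPPERS: Dict[Tuple[str, str], Callable[[dict], dict]] = {
    ("ethereum", "Transfer"): from_ethereum,
    ("flow", "Transfer"): from_flow,
}


def persist(events: List[Tuple[str, str, dict]], store: List[dict]) -> Counter:
    """Map registered events to the unified schema, store them, and count them per type."""
    counts: Counter = Counter()
    for chain, name, raw in events:
        mapper = MAPPERS.get((chain, name))
        if mapper is None:
            continue  # non-registered event: skipped, not persisted
        record = mapper(raw)
        store.append(record)
        counts[record["type"]] += 1
    return counts
```

With one mapper per chain, downstream queries see a single record shape regardless of where the event originated, and the returned per-type counts are the kind of figure a later verification step can compare against.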
In the Notify Event phase, we check whether $\mathrm{count}(\mathit{notificationSent}) = \sum_{\mathit{type}} \mathrm{count}(\mathit{persistedEventsPerType})$, where the right-hand side is the checksum calculated in the previous step, to make sure no registered events go missing during the notification step. These checksums are persisted in the Block Sync DB as structured data and can be used for analytics and monitoring purposes. For instance, any checksum verification failure results in an instant alarm. Developers can then use the persisted checksums of each step to expedite debugging and root cause analysis. To build a robust, fault-tolerant system, we employ Apache Kafka to manage inter-module communication and jobs. It permits asynchronous communication, which optimizes the data flow throughout the system. Queues make our intermediate event data persistent, which improves reliability and reduces errors when different pieces of our system are unavailable. Additionally, it can be quickly scaled to distribute workload among a fleet of users during peak periods. Once an event has been passed from the fetcher to the event store, if there are registered subscribers for a synchronized EOI, the event dispatcher notifies them of the structured event information via a webhook. Webhooks enable users to be notified when an EOI occurs on the blockchain. Rather than querying the server continuously to see whether the state has changed, subscribers receive information through webhooks as it becomes available, which is far more efficient and advantageous for developers. Webhooks operate by registering a URL endpoint to which notifications should be sent when a specified EOI occurs. The developer maintains complete control over the endpoint, which could be a third-party service or one of their own. Matrix Syncer has been deployed in the industry to serve a variety of decentralized metaverse projects. We look at two of them here, RiverMen 16 and MatrixWorld 17, which are both representative projects with good operation cases.
Table III-G provides an overview of their market statistics. RiverMen is known as the world's first metaverse project dedicated to exporting traditional Chinese culture and was first released in August. MatrixWorld is our first-party project, which was launched in October and is the first metaverse project with multi-chain support. It currently supports the Ethereum and Flow networks and is among the top three projects by transaction volume (unofficial) on the Flow network. It is commonly established that DApps must provide a front-end user interface, as shown in Figure 2, in order to provide an interactive user experience. RiverMen's website has a feature to display users' own tokens and perform advanced queries among their tokens as well as the entire collection. A typical smart contract interface is insufficient for those scenarios. However, by defining a mapping schema in Matrix Syncer, all events emitted by minting and transferring are tracked in the event store with a timestamp. To meet the aforementioned needs, RiverMen's front end can simply call the query APIs of our event query service endpoint. Additionally, because all events are timestamped when they are saved to the store, developers can perform advanced queries, such as retrieving the entire transaction history of a particular token or obtaining statistics such as the number of transactions of a given token within a specified time window. Similarly, in MatrixWorld, users can locate and investigate all land information directly from the map interface shown in Figure 2 (left). We defined minting and transfer mapping schemas for the Ethereum and Flow blockchains and unified their data structures in the event store. Our front end can then simply query the needed information without having to write two data parsers for the distinct chains.
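The two timestamp-based queries mentioned above (full history of a token, and the number of transactions in a time window) can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions, not the Matrix Syncer query API or a DynamoDB call: the record fields `token_id` and `timestamp` and both function names are hypothetical.

```python
from typing import Iterable, List


def token_history(events: Iterable[dict], token_id: str) -> List[dict]:
    """Return all stored events for one token, oldest first."""
    matching = [e for e in events if e["token_id"] == token_id]
    return sorted(matching, key=lambda e: e["timestamp"])


def count_in_window(events: Iterable[dict], token_id: str, start: int, end: int) -> int:
    """Count a token's events whose timestamp falls in the half-open window [start, end)."""
    return sum(
        1 for e in events
        if e["token_id"] == token_id and start <= e["timestamp"] < end
    )
```

Because records from both chains share one schema in the event store, the same two functions would serve an Ethereum token and a Flow token without chain-specific parsing.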
We demonstrated that the Matrix Syncer platform can significantly reduce the effort required to onboard a new project with historical on-chain data querying requirements, regardless of the blockchain it is built on or the number of chains it supports concurrently. Apart from advanced on-chain state queries, event reflection processing is critical for many products with more complicated business logic. RiverMen introduced the RiverSpace token, which is minted through the fusion of a set of RiverMen tokens, as shown in Figure 2 (right). The 3D model of the newly created RiverSpace token is built and rendered dynamically based on the metadata of the RiverMen tokens. However, due to the restricted compute capacity on-chain, these procedures must be performed off-chain. Here, the RiverMen team registered an event emitted by the RiverSpace contract that contains the token IDs of each RiverMen token used to fuse the newly minted RiverSpace token. By registering the target event and the webhook on Matrix Syncer, they only need to develop rendering and metadata services to handle off-chain processing. Once their services receive the event, they begin rendering the new 3D model based on the metadata of the component tokens, which can be fetched through their token IDs. Once the rendering process is complete, they can update their own metadata services to enable the front end to display the RiverSpace 3D model. Here, developers are able to concentrate entirely on their own business logic without having to implement or maintain any of the servers or infrastructure required to interact with blockchains. Following the best practices of web services development, Matrix Syncer uses the GPL (Grafana/Prometheus/Loki) stack for metrics, alarms, and log viewing for regular DevOps purposes. Aside from that, we also utilize our persisted checksums for real-time alarming and analytics.
16 https://www.rivermen.io/
17 https://matrixworld.org/
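As an illustration of how a persisted checksum can drive an alarm, the sketch below compares the number of notifications sent against the sum of persisted events per type, the notification-phase check described earlier. The function name, the shape of the returned record, and the `alarm` callback are hypothetical; in a real deployment the callback would forward to whatever alerting channel is in use.

```python
from typing import Callable, Dict


def verify_notifications(
    job_id: str,
    notifications_sent: int,
    persisted_per_type: Dict[str, int],
    alarm: Callable[[str], None],  # hypothetical alerting hook
) -> dict:
    """Check count(notificationSent) == sum over types of count(persistedEventsPerType)."""
    expected = sum(persisted_per_type.values())
    ok = notifications_sent == expected
    if not ok:
        alarm(f"checksum mismatch in job {job_id}: sent={notifications_sent}, expected={expected}")
    # The returned record is structured so it can be stored and charted later.
    return {"job_id": job_id, "expected": expected, "sent": notifications_sent, "ok": ok}
```

Returning a structured record rather than only raising the alarm keeps the same data usable for later analytics, such as per-job mismatch rates.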
We set up instant alarms during Block Fetching Job execution and also stream the checksum data to a Grafana dashboard as a second safeguard. In MatrixWorld, the checksum failure alarm helped us discover that a data sync error was due to an unstable data node provider. In RiverMen, a checksum stats mismatch helped us find a bug in the event processing chain. Furthermore, we ran an analytics query on our persisted checksum stats to help our customers and our team understand each typed event's processing rate and distribution over time.
Fig. 3. Monitoring dashboard and data analytics with Grafana
With instant setup, multi-chain support, and cloud-based IaaS, our proposed Matrix Syncer is designed to be the ultimate solution for developing complex metaverse DApps that demand advanced on-chain data operation and event reflection. All of these infrastructure advantages can help hasten the construction of a functional metaverse that is more user-friendly, interoperable, and accessible. By using the elasticity, high availability, and flexibility of cloud computing, Matrix Syncer bridges the gap between web 2.0 and web 3.0, leveraging the strengths of each and enabling the rapid development of a functional, decentralized metaverse with enhanced user-friendliness, interoperability, and accessibility. Future work will integrate additional decentralized protocols into the present system, since we expect the current web 3.0 shortcomings to be resolved as the blockchain's fundamental technology advances.
3D virtual worlds and the metaverse: Current status and future possibilities
Investigation and research on the negotiation space of mental and mental illness based on metaverse
Metaverse for social good: A university campus prototype
Distributed metaverse: Creating decentralized blockchain-based model for peer-to-peer sharing of virtual spaces for mixed reality applications
A peer-to-peer electronic cash system
A next-generation smart contract and decentralized application platform
What's holding back blockchain finance? On the possibility of decentralized autonomous finance
Make web3.0 connected
Decentralized identifiers for peer-to-peer service discovery
Blockchain as a service (BaaS): Providers and trust
ChainIDE 2.0: Facilitating smart contract development for consortium blockchain
The impact of uncle rewards on selfish mining in Ethereum