key: cord-0779997-91k5jda9
authors: Yu, Siyang; Sun, Si; Yan, Wei; Liu, Guangshuai; Li, Xurui
title: A Method Based on Curvature and Hierarchical Strategy for Dynamic Point Cloud Compression in Augmented and Virtual Reality System †
date: 2022-02-07
journal: Sensors (Basel)
DOI: 10.3390/s22031262
sha: 2d96a73f5bb805b8fd46e02ef2eea8702cf14605
doc_id: 779997
cord_uid: 91k5jda9

As an information-intensive 3D representation, the point cloud is developing rapidly in immersive applications, which has sparked new attention to point cloud compression. The most popular dynamic methods ignore the characteristics of point clouds and use an exhaustive neighborhood search, which seriously impacts the encoder's runtime. Therefore, we propose an improved compression method for dynamic point clouds based on curvature estimation and a hierarchical strategy to meet the demands of real-world scenarios. This method includes initial segmentation derived from the similarity between normals, a curvature-based hierarchical refining process for iterating, and image generation and video compression technology based on de-redundancy without performance loss. The curvature-based hierarchical refining module divides the voxel point cloud into high-curvature points and low-curvature points and optimizes the initial clusters hierarchically. The experimental results show that our method achieved improved compression performance and faster runtime than traditional video-based dynamic point cloud compression.

Since the beginning of the twenty-first century, the development of three-dimensional (3D) sensing technology has set off a wave of innovation in Augmented and Virtual Reality Production (AR/VR). In 2007, Google introduced Street View, which allowed users to view and navigate the virtual world, and now there is a complete version that is commercially available [1]. In 2014, Samsung released the Samsung Gear VR for Galaxy smartphones to offer a completely untethered, easy-to-use experience [2]. In 2020, the coronavirus disease catalyzed AR retail and VR conferences. Meanwhile, the use of 3D point clouds to represent real-world scenarios in an immersive fashion for AR/VR systems has become increasingly popular in recent decades. Apple brought point clouds to mobile devices in 2020 and successfully created a more realistic augmented reality experience [3]. In 2021, Zhongxing Telecom Equipment built the AR Point Cloud Digital Twin Platform and said that the platform had formed a scaled deployment and demonstration effect [4]. Moreover, point clouds have achieved significant success in many areas, e.g., visual communication [5], auto-navigation [6], and immersive systems [7]. Usually, a point cloud comprises its geometric coordinates and various attributes such as color, normal, temperature, and depth. It always comes with a large amount of data, which leads to the heavy transmission

1. An innovative curvature-based hierarchical dynamic PCC method is proposed to reduce the time complexity of the refinement operation and even the overall framework;

2. The decision-making methods of voxel size and the number of iterations are proposed to obtain a good trade-off between compression quality and compression time.

The remainder of this paper is organized as follows.

Following this introduction, we describe the curvature-based hierarchical dynamic PCC method in detail in Section 2. In Section 3, we select eight sequences (Redandblack, Soldier, Longdress, Loot, Basketball_player, Dancer, Ricardo, and Phil) and present the comparison results between the proposed technique and the classic video-based algorithm. The experiments show that the proposed scheme can reduce the overall runtime by 33.63% on average, with a clear Rate-Distortion (R.D.) performance benefit and subjective effect improvement. Finally, we conclude the entire paper in Section 4.

In the past decades, the video-based theory has been proven to be a low-cost solution for promoting point cloud applications. This paper first utilizes the normal similarity between real points and pre-orientated elemental planes to segment the point cloud frame. Next, the curvature characteristic is used to hierarchically optimize the clusters of the first step. Then, these refined clusters are packed into images, such as a geometric image and a color image. Finally, all images can be compressed with an existing versatile video codec. Because the latter two steps can follow the current literature, the proposed method can be summarized as the following three main steps: initial segmentation, curvature-based hierarchical refinement, and post-processing (including image generation and video-based coding).

The premise of using 2D video coders is that the point cloud model is mapped to a 2D plane in a simple but efficient way. Projecting the model onto the six planes of its bounding box is given priority.

More precisely, we first calculate the normal vectors of all points in the point cloud. Then, the point cloud is divided into six basic categories using the six planes of the predefined unit cube, which can be represented by the normal vectors (1, 0, 0), (0, 1, 0), (0, 0, 1), (−1, 0, 0), (0, −1, 0), and (0, 0, −1), as shown in Figure 1. The segmentation basis is the similarity of the normal vectors between the real point and the orientation plane: each point is assigned to the plane that maximizes the dot product of the point's normal vector and the plane's normal vector. The corresponding pseudo-code is Algorithm 1.

Segmentation aims to find patches that are time-coherent, have low distortion, and are convenient for dimensionality reduction. Maximizing time coherence and minimizing distance and angle distortion helps the video encoder make full use of the spatiotemporal correlation of point cloud geometry and attribute information. However, the previous work does not guarantee a good reconstruction effect due to auto-occlusions. To avoid this, we then refine the clustering results.
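The initial segmentation described above can be illustrated with a minimal sketch. It assumes that unit normals have already been estimated; the function name `initial_segmentation` and the array layout are hypothetical and are not taken from the original Algorithm 1.

```python
import numpy as np

# Normals of the six projection planes listed in the text.
ORIENTATIONS = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [-1, 0, 0], [0, -1, 0], [0, 0, -1],
], dtype=float)

def initial_segmentation(normals):
    """Assign each point to the plane whose normal gives the largest
    dot product with the point's normal (hypothetical helper)."""
    normals = np.asarray(normals, dtype=float)
    score_normal = normals @ ORIENTATIONS.T  # shape: (num_points, 6)
    return np.argmax(score_normal, axis=1)

# Three illustrative normals (made-up values).
normals = np.array([
    [0.9, 0.1, 0.0],
    [0.0, -1.0, 0.0],
    [0.1, 0.2, -0.97],
])
labels = initial_segmentation(normals)
print(labels.tolist())
```

Each entry of `labels` is an index into `ORIENTATIONS`, so the clusters can later be refined without recomputing the normals.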

The input of the refinement module is the clustered geometric coordinates and attributes. The goal is to obtain clusters that meet the requirements of image or video compression technology, such as smooth borders and less occlusion. Patches with smooth edges are highly effective in the subsequent geometry-filling and attribute-filling steps, which inspired us to consider adjacent points in the later refining process. Less occlusion means avoiding, as far as possible, the projection of different points onto the same location by controlling the projection direction. More collisions cause more information loss, so the normal vector of the projection plane is also deemed a considerable factor.

Figure 2 provides an overview of the curvature-based hierarchical refinement, which is a friendlier refining solution that reduces computational complexity and runtime and does not alter the bitstream syntax and semantics or the decoder behavior.

The main steps cover the partition of geometric coordinate space, neighborhood information search, and clustering based on scores. It should be noted that the neighborhood information search comprises the calculation of curvature and the hierarchical search; the final scores are derived from the normal vector score, voxel-smooth score, and smooth score.

Even if only ten neighbors are searched for each point, updating the clusters for a whole cloud containing hundreds of thousands of points is laborious. As a result, the refining procedure was suggested to be simplified by adding a voxel-based constraint to the neighborhood search [22]. Inspired by this, we first use uniform voxels to divide geometric space, then perform a neighborhood search on the voxelized point cloud instead of the entire model. The first traversed point in each voxel, rather than the geometric center point, is selected to identify the voxel because the subsequent calculations are performed on integers. Specifically, for each voxel, the coordinates of the identification point and of the contained points are stored. In the searched nearest-neighboring filled voxels, all interior points that meet the distance limit are regarded as the neighborhood information of the current voxel.

The larger the voxel size, the fewer points need to be considered in each neighborhood search and the fewer identification points are explored, which means the difference between the searched neighbors and the actual situation is more significant, as shown in Figure 3.
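The voxel partition and the voxel-level neighborhood search can be sketched as follows. This is only an illustration under stated assumptions: the names `voxelize` and `neighbor_voxels` are hypothetical, the search is a brute-force scan rather than an accelerated structure, and the distance is measured between identification points.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Group integer points into uniform voxels.

    Returns {voxel_key: [point indices]}. The first index in each list is
    the first traversed point, used here as the identification point.
    """
    voxels = {}
    for idx, p in enumerate(np.asarray(points, dtype=np.int64)):
        key = tuple(int(c) for c in p // voxel_size)
        voxels.setdefault(key, []).append(idx)
    return voxels

def neighbor_voxels(voxels, points, key, radius):
    """Filled voxels whose identification point lies within `radius` of the
    identification point of voxel `key` (the voxel itself is included)."""
    points = np.asarray(points, dtype=np.int64)
    center = points[voxels[key][0]]
    result = []
    for other, members in voxels.items():
        d2 = int(np.sum((points[members[0]] - center) ** 2))
        if d2 <= radius * radius:
            result.append(other)
    return result

# Made-up integer coordinates for illustration.
points = [[0, 0, 0], [1, 1, 1], [2, 0, 0], [9, 9, 9]]
voxels = voxelize(points, voxel_size=2)
near = neighbor_voxels(voxels, points, (0, 0, 0), radius=3)
print(voxels, near)
```

With a larger `voxel_size`, fewer identification points exist and each search touches fewer candidates, which mirrors the speed versus fidelity trade-off noted in the text.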


The above partition idea provides convenient conditions for neighborhood search. However, redundant calculations may occur if all regions are treated equally and use a fixed search radius, as shown in Figure 4. If different clusters are marked with different colors, it can be seen that plenty of points in the chest and abdomen of the Loot point cloud belong to the same group. When the clustering index is updated iteratively there, the results obtained with various search radii are consistent because all neighbors have identical indexes. Accordingly, it is reasonable to use a smaller search radius for some local regions.

Considering that the update of the clustering index is related to the neighbors and inseparable from the normal vectors, the curvatures derived from normal vectors on local surfaces are selected to determine the search radius. Points of low curvature, located on a virtually flat surface, need only a few neighbors to reflect the index-change trend. By comparison, points with high curvature, whose normals differ noticeably from those of adjacent points, call for a more comprehensive neighborhood search to prevent the normals from unduly influencing the final result, as shown in Figure 5. Inside a cluster, the small-scale neighborhood search increases the number of patches for high-curvature regions but does not affect low-curvature parts. At a cluster boundary, the small-scale neighborhood search results in patches with sharp edges for high-curvature areas but has little impact on low-curvature areas. Therefore, the search for nearest-neighboring filled voxels in this paper is based on curvature grading to reduce the amount of calculation: low-curvature zones use a small-scale neighborhood search, while high-curvature zones use a large-scale neighborhood search.

Figure 5. The clustering results obtained by searching with various radii in a specific area (a1 is a low-curvature area, while a2 is a high-curvature area).

We apply the local surface fitting method to calculate curvatures. Firstly, principal component analysis is used to estimate normal vectors. Then, a minimum spanning tree is used to orient the normal vectors. The normal vector of $P_i$ is used as the vertical axis to establish a new local coordinate system. Finally, a quadric surface fitting is performed on the local coordinate system, and its fitted surface parameters are used to estimate the curvature at $P_i$ from $K_1$ and $K_2$, the two principal curvatures of $P_i$.

The predictions of curvatures are based on the identification points instead of the entire cloud model, as the number of voxels is far less than the number of actual points. In Figure 1, the curvature histogram of the identification points shows that only a few points had a relatively high curvature, as most areas had excellent flatness. Consequently, all identification points are classified into low-curvature points and high-curvature points according to the low-curvature determination ratio, as shown in Figure 6. Then, we can estimate the neighborhood information hierarchically to reduce complexity.
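The quadric fitting and the ratio-based classification can be sketched as below. This is a sketch under assumptions, not the authors' implementation: the surface model $z = au^2 + buv + cv^2$ has no linear terms (it presumes the given normal is accurate), the helper names are hypothetical, and the scalar curvature used for ranking is left to the caller because the source's curvature formula is not recoverable here.

```python
import numpy as np

def principal_curvatures(neighbors, point, normal):
    """Fit z = a*u^2 + b*u*v + c*v^2 in a local frame whose z-axis is
    `normal`; return the two principal curvatures in ascending order."""
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Pick a helper axis that is not parallel to the normal.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u_axis = np.cross(n, helper)
    u_axis /= np.linalg.norm(u_axis)
    v_axis = np.cross(n, u_axis)
    local = np.asarray(neighbors, dtype=float) - np.asarray(point, dtype=float)
    u, v, z = local @ u_axis, local @ v_axis, local @ n
    design = np.column_stack([u * u, u * v, v * v])
    (a, b, c), *_ = np.linalg.lstsq(design, z, rcond=None)
    # With zero slope at the origin, the curvatures are the eigenvalues
    # of the Hessian of the fitted height function.
    return np.linalg.eigvalsh(np.array([[2 * a, b], [b, 2 * c]]))

def split_by_curvature(curvatures, low_ratio):
    """Mark the `low_ratio` fraction with the smallest curvature as low."""
    curvatures = np.asarray(curvatures, dtype=float)
    threshold = np.quantile(curvatures, low_ratio)
    return curvatures <= threshold  # True means low-curvature point

# Synthetic check: samples of z = 0.5*x^2 + 0.25*y^2 around the origin.
xs, ys = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
patch = np.column_stack([xs.ravel(), ys.ravel(),
                         0.5 * xs.ravel() ** 2 + 0.25 * ys.ravel() ** 2])
k = principal_curvatures(patch, point=[0, 0, 0], normal=[0, 0, 1])
is_low = split_by_curvature(np.arange(100), low_ratio=0.92)
print(k, int(is_low.sum()))
```

Using a quantile threshold keeps the share of low-curvature points fixed regardless of the absolute curvature scale of a given sequence.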

For points with different curvature grades, various search radii are used. All identification points are traversed, and the nearest-neighboring voxels of each identification point are stored in a vector and used to compose the final score.

The normal vector score, which refers to the influence of the normal vectors on the clustering index, must be considered first to get a patch with less occlusion. The six projection planes mentioned above are also used to estimate the normal vector score in the refine segmentation process, according to

$$\mathrm{scoreNormal}[i][p] = \mathrm{normal}[i] \cdot \mathrm{orientation}[p],$$

where normal[i] is the normal vector of the i-th point, and orientation[p] is the normal vector of the p-th projection plane. The projection results for different planes are shown in Figure 7. The greater the normal vector score, the fewer points are projected to the same position and the fewer collisions occur. Therefore, the calculation of normal vector scores is necessary for each iteration.

The neighbors of each point also affect its clustering index so as to avoid uneven boundaries; this is captured by the smooth score, which refers to the influence of adjacent points on the final clustering index. If the number of neighbors corresponding to $P_i$ is $\mathrm{numNeighbors}_i$, the amount of computation for smooth scores is

$$\sum_{i=1}^{N} \mathrm{numNeighbors}_i,$$

where N is the number of identification points. The appraisal of the smooth scores is undoubtedly demanding but can be simplified by accumulating the voxel-smooth score, thanks to the partition of geometric coordinate space. The voxel-smooth score for each filled voxel associated with each projection plane needs to be computed first by counting the number of points in the voxel that are clustered to the projection plane during the refining process. Then, the smooth score can be defined as

$$\mathrm{scoreSmooth}[v][p] = \sum_{j} \mathrm{voxScoreSmooth}[\mathrm{nnFilledVoxels}[v_j]][p],$$

where v is the index of the voxel containing the i-th point and p is the projection plane index; nnFilledVoxels[v_j] is the j-th neighboring voxel and voxScoreSmooth is the set of voxel-smooth scores for all the adjacent voxels. Hence, the smooth score is identical for all points inside a voxel. The normal vector score and the smooth score, as mentioned above, are combined to determine the final clustering index through a weighted linear combination, in which λ is the influence coefficient of the smooth score on the final score and its value is specified by [30]. After clustering each point to the projection plane having the highest final score as calculated at Equation (6), the cluster update is completed once.

In the refining process, the total number of loops would be

$$\mathrm{maxNumIters} \times \mathrm{numPlanes} \times \sum_{i=1}^{N} \mathrm{numNeighbors}_i,$$

where maxNumIters is the maximum number of iterations and numPlanes is the number of projection planes in the refinement. A small-scale neighborhood search is used for most points in the proposed method, so the numNeighbors i for most points is lower than that of the video-based method. Therefore, the total number of loops diminishes, which successfully reduces the computational complexity.
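A small worked example makes the saving concrete. It assumes that the loop count is the product of the iteration count, the number of planes, and the summed neighbor counts; all numbers below (1000 identification points, 50 or 20 neighbors per point) are invented for illustration and are not measurements from the paper.

```python
def total_loops(num_neighbors, max_num_iters, num_planes=6):
    """Loop count under the assumed product form (hypothetical helper)."""
    return max_num_iters * num_planes * sum(num_neighbors)

n_points = 1000
low_count = int(0.92 * n_points)  # share of low-curvature points

# Fixed radius: every identification point inspects 50 neighbors.
fixed = [50] * n_points
# Hierarchical radius: low-curvature points inspect only 20 neighbors.
hierarchical = [20] * low_count + [50] * (n_points - low_count)

loops_fixed = total_loops(fixed, max_num_iters=10)
loops_hier = total_loops(hierarchical, max_num_iters=10)
print(loops_fixed, loops_hier, 1 - loops_hier / loops_fixed)
```

Because most points fall into the low-curvature class, shrinking only their neighborhoods removes more than half of the loops in this toy setting.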

In summary, the pseudo-code for refinement is given in Algorithm 2.
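Since the listing itself is not reproduced here, the refinement loop can be approximated by the following sketch. Several points are assumptions rather than facts from the source: the function name `refine`, the input layout, and in particular the normalization of the smooth score by the number of neighboring points, which is added only to keep the two score terms on a comparable scale. The neighbor lists are expected to be precomputed with a small or large radius according to the curvature class of each voxel.

```python
import numpy as np

ORIENTATIONS = np.array([
    [1, 0, 0], [0, 1, 0], [0, 0, 1],
    [-1, 0, 0], [0, -1, 0], [0, 0, -1],
], dtype=float)

def refine(normals, labels, voxel_of_point, nn_filled_voxels, lam, max_num_iters):
    """Iteratively re-cluster points (hypothetical sketch).

    normals:          (N, 3) unit normals
    labels:           (N,) initial plane index per point
    voxel_of_point:   (N,) voxel index of each point
    nn_filled_voxels: list; entry v lists the neighboring voxels of voxel v
    lam:              weight of the smooth score
    """
    normals = np.asarray(normals, dtype=float)
    labels = np.array(labels, dtype=int)
    voxel_of_point = np.asarray(voxel_of_point, dtype=int)
    num_voxels, num_planes = len(nn_filled_voxels), len(ORIENTATIONS)
    score_normal = normals @ ORIENTATIONS.T
    for _ in range(max_num_iters):
        # Voxel-smooth score: points per voxel currently assigned to each plane.
        vox_score = np.zeros((num_voxels, num_planes))
        np.add.at(vox_score, (voxel_of_point, labels), 1.0)
        # Smooth score: accumulate voxel-smooth scores over neighboring voxels.
        score_smooth = np.stack([
            vox_score[np.asarray(nn, dtype=int)].sum(axis=0)
            for nn in nn_filled_voxels
        ])
        # Assumed normalization by the number of neighboring points.
        counts = np.maximum(score_smooth.sum(axis=1, keepdims=True), 1.0)
        score = score_normal + lam * (score_smooth / counts)[voxel_of_point]
        labels = np.argmax(score, axis=1)
    return labels

# Toy input: four flat points and one outlier whose normal leans to plane 1.
normals = [[1, 0, 0]] * 4 + [[0.6, 0.8, 0.0]]
initial = [0, 0, 0, 0, 1]
voxel_of_point = [0, 0, 1, 1, 2]
# Voxel 0 (low curvature) uses a small neighborhood; voxels 1 and 2 a larger one.
nn_filled_voxels = [[0, 1], [0, 1, 2], [0, 1, 2]]
refined = refine(normals, initial, voxel_of_point, nn_filled_voxels,
                 lam=3.0, max_num_iters=2)
print(refined.tolist())
```

In this toy case the isolated point is absorbed by the surrounding cluster, which is the boundary-smoothing effect the refinement aims for.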

Consistent with classical video-based methods, the connected component extraction algorithm is applied to extract the patches obtained by curvature-based hierarchical refinement, and then we map the connected components to a 2D grid. The mapping process needs to minimize the 2D unused area, and each grid belongs to only one patch. The geometric information of the point cloud is stored in the grid to generate the corresponding 2D geometric image. Similarly, the attribute image can also be easily obtained. To better handle the case of multiple points being projected to the same pixel, the connected component can be projected onto more than one image.

Finally, the generated images are stored as video frames and compressed using the video codec according to the configurations provided as parameters. Details are available in [21, 30] .
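Projecting one connected component onto more than one image can be illustrated with a two-layer depth map. This is a hedged sketch: the function `project_patch`, the `thickness` limit, and the use of -1 for empty pixels are illustrative assumptions and are not details given in the text.

```python
import numpy as np

def project_patch(points, axis, size, thickness=4):
    """Project a patch along `axis` into two depth layers.

    Layer `near` keeps the smallest depth per pixel; layer `far` keeps the
    largest depth that lies within `thickness` of `near`. Empty pixels are -1.
    """
    points = np.asarray(points, dtype=np.int64)
    u_axis, v_axis = [a for a in range(3) if a != axis]
    near = np.full((size, size), -1, dtype=np.int64)
    far = np.full((size, size), -1, dtype=np.int64)
    # First pass: nearest depth per pixel.
    for p in points:
        u, v, d = p[u_axis], p[v_axis], p[axis]
        if near[u, v] < 0 or d < near[u, v]:
            near[u, v] = d
    # Second pass: farthest depth within the allowed thickness.
    for p in points:
        u, v, d = p[u_axis], p[v_axis], p[axis]
        if near[u, v] <= d <= near[u, v] + thickness and d > far[u, v]:
            far[u, v] = d
    return near, far

# Made-up patch: three points collide on pixel (1, 2) when projected along z.
patch = [[1, 2, 5], [1, 2, 7], [1, 2, 20], [3, 0, 4]]
near, far = project_patch(patch, axis=2, size=4)
print(near[1, 2], far[1, 2])
```

Points farther than `thickness` from the first layer (depth 20 here) are not representable in this patch and would need to go to another patch.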

We carried out many tests using dynamic point clouds captured by RGBD cameras. The four sequences Redandblack, Soldier, Longdress, and Loot in the MPEG 8i Dataset [31] are complete point clouds with a voxel size close to 1.75 mm and a resolution of 1024 × 1024 for texture maps; the two sequences Basketball_player and Dancer in the Owlii Dataset [32] are complete point clouds with a resolution of 2048 × 2048 for texture maps; Ricardo and Phil are two sequences in the Microsoft Voxelized Upper Bodies Dataset [33] and are frontal upper-body point clouds with a voxel size close to 0.75 mm and a resolution of 1024 × 1024 for texture maps, as shown in Figure 8. The point cloud is a set of points (x, y, z) constrained to lie on a regular 3D grid. In other words, it can be assumed to be an integer lattice. The geometric coordinates may be interpreted as the address of a volumetric element or voxel. The attributes of a voxel are the red, green, and blue components of the surface color. Note that each sequence takes 32 frames for experimentation and comparison.

The most popular V-PCC is selected as the comparison item to analyze the advantages and disadvantages of the proposed method. For better evaluation of the R.D. quality, the Bjontegaard Delta-Rate (BD-rate) and Bjontegaard Delta Peak Signal-to-Noise Ratio (BD-PSNR) metrics [34] are calculated, which makes the comparison of different compression solutions possible when considering several rate-distortion points. The PSNR, which aims to report the distortion values, is calculated as

$$\mathrm{PSNR}_{\mathrm{color}} = 10 \log_{10}\left(\frac{(p_{\mathrm{color}})^2}{\mathrm{MSE}}\right), \qquad \mathrm{PSNR}_{\mathrm{geometry}} = 10 \log_{10}\left(\frac{3p^2}{\mathrm{MSE}}\right),$$

where $p$ and $p_{\mathrm{color}}$ are the peak constant values for geometric distortions and color distortions of each reference point cloud, respectively, and MSE is the mean squared error. Based on this, BD-rate is defined as the average difference between the area integral of the lower curve divided by the integral interval and that of the upper curve divided by the integral interval:

$$\text{BD-rate} = \frac{1}{D_H - D_L} \int_{D_L}^{D_H} \left(r_2(D) - r_1(D)\right) \, dD,$$

where $r = a + bD + cD^2 + dD^3$, $r = \log(R)$, and R is the bitrate; a, b, c, and d are fitting parameters; D is the PSNR; $D_H$ and $D_L$ are the high and low ends of the integration interval, respectively; and $r_2$ and $r_1$ are the two curves. A negative BD-rate indicates that the encoding performance of the optimized algorithm has been improved. On the other hand, BD-PSNR expresses the improvement in objective quality at the same rate, which is given as

$$\text{BD-PSNR} = \frac{1}{r_H - r_L} \int_{r_L}^{r_H} \left(D_2(r) - D_1(r)\right) \, dr,$$

where $D = a + br + cr^2 + dr^3$; $r_H$, $r_L$, $D_2(r)$, and $D_1(r)$ are the highest logarithm of bitrate, the lowest one, the original curve, and the compared curve, respectively. The larger the BD-PSNR, the better the proposed algorithm. Furthermore, without losing fairness, the experiments strictly implement the common test condition for dynamic PCC provided by MPEG [24].

Table 1 provides the BD-rate, BD-PSNR, and runtime savings for V-PCC with a voxel size of four compared to a voxel size of two. As the voxel size increases, the encoder's runtime is reduced by 53.73% on average, but the related geometric and color quality suffers a severe loss. In detail, the D1 bitrate increases by an average of 2.78%, while the Y bitrate rises by an average of 3.32%. MPEG explains that the large voxel size is more suitable for real-time applications because high-precision reconstruction is not required in this case. However, we are inclined to promote the real-time capability of compression schemes while ensuring high quality. Therefore, we focus on reducing complexity while retaining a great deal of quality, based on a voxel size equal to two.

Figure 9 describes the impact of the maximum number of iterations on the geometric performance, color performance, and runtime. The results show that although the number of iterations increased eightfold, D1-PSNR increased by less than 0.1%. In terms of color, the compression result after ten iterations was not the worst, and the outcome of 90 iterations was not the best. Meanwhile, the cost of time steadily increased. In summary, raising the number of iterations has little impact on geometric performance, yields unstable attribute optimization, and increases time. Consequently, we directly suggest the lesser value in [24], which is 10.
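The PSNR and Bjontegaard metrics defined above can be computed along the following lines. The sketch makes several assumptions that the text does not fix: a base-10 logarithm for the bitrate, integration over the overlapping range of the two curves, and the customary conversion of the average log-rate difference into a percentage. The function names and the rate and PSNR values are hypothetical.

```python
import numpy as np

def psnr_color(mse, peak=255.0):
    return 10 * np.log10(peak ** 2 / mse)

def psnr_geometry(mse, peak):
    return 10 * np.log10(3 * peak ** 2 / mse)

def _average_gap(x1, y1, x2, y2):
    """Fit y as a cubic in x for both curves and return the mean of
    (curve 2 - curve 1) over the shared x interval."""
    x1, y1, x2, y2 = (np.asarray(a, dtype=float) for a in (x1, y1, x2, y2))
    fit1, fit2 = np.polyfit(x1, y1, 3), np.polyfit(x2, y2, 3)
    low, high = max(x1.min(), x2.min()), min(x1.max(), x2.max())
    area1 = np.diff(np.polyval(np.polyint(fit1), [low, high]))[0]
    area2 = np.diff(np.polyval(np.polyint(fit2), [low, high]))[0]
    return (area2 - area1) / (high - low)

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate change at equal quality, in percent."""
    gap = _average_gap(psnr_ref, np.log10(rate_ref), psnr_test, np.log10(rate_test))
    return (10 ** gap - 1) * 100

def bd_psnr(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average PSNR change at equal bitrate, in dB."""
    return _average_gap(np.log10(rate_ref), psnr_ref, np.log10(rate_test), psnr_test)

# Invented R.D. points: the test codec needs 10% less rate at equal PSNR.
rate_ref = np.array([100.0, 200.0, 400.0, 800.0])
psnr_ref = np.array([30.0, 33.0, 36.0, 39.0])
saving = bd_rate(rate_ref, psnr_ref, 0.9 * rate_ref, psnr_ref)
gain = bd_psnr(rate_ref, psnr_ref, rate_ref, psnr_ref + 0.5)
print(round(saving, 3), round(gain, 3))
```

Four R.D. points per curve determine the cubic fit exactly, which is why the common test conditions typically report at least four rate points.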

According to the common test condition, the search radius for high-curvature points is set to 96. The other radius suitable for low-curvature points needs to be further analyzed, as shown in Figures 10 and 11 . We used two types of point clouds for experiments. 

According to the common test condition, the search radius for high-curvature points is set to 96. The other radius suitable for low-curvature points needs to be further analyzed, as shown in Figures 10 and 11 . We used two types of point clouds for experiments. 

According to the common test condition, the search radius for high-curvature is set to 96. The other radius suitable for low-curvature points needs to be furthe lyzed, as shown in Figures 10 and 11 . We used two types of point clouds for experi The R.D. performance is built up when the radius decreases slightly but de sharply with further decrease. Even when the radius is less than 25, both the geo and the color information show a huge loss. Regarding time consumption, when dius is greater than 36, the saving rate of encoder's runtime is not more than 30%. H the radii equal to 25 or 36 are substituted into the analysis of low-curvature ratio, as s in Figure 10 . The radius of 25 is outstanding on time but poor on quality. On the con no matter what the low-curvature ratio is, the compression result of the encoder radius of 36 is satisfactory. Considering that the ultimate goal is to reduce the runtim low-curvature ratio is set as 0.92 and the search radius of low curvature is set as 36

All point clouds have a similar change trend on their R.D. curves. Figure 12 pr a detailed performance description of Redandblack on geometry and color. The curv based hierarchical method has a performance like V-PCC at a low bpp, but less cost with better quality at a high bpp. It is undeniable that the proposed method is va the dynamic condition. From Tables 2 and 3, our method is much better than V-PCC, with a voxel size in geometry and color performance. However, there is also an average 50.52% incre runtime due to the reduction of voxel size which enhances the complexity of the borhood search and the determination of final score. This is negligible in most applic The R.D. performance is built up when the radius decreases slightly but declines sharply with further decrease. Even when the radius is less than 25, both the geometry and the color information show a huge loss. Regarding time consumption, when the radius is greater than 36, the saving rate of encoder's runtime is not more than 30%. Hence, the radii equal to 25 or 36 are substituted into the analysis of low-curvature ratio, as shown in Figure 10 . The radius of 25 is outstanding on time but poor on quality. On the contrary, no matter what the low-curvature ratio is, the compression result of the encoder with a radius of 36 is satisfactory. Considering that the ultimate goal is to reduce the runtime, the low-curvature ratio is set as 0.92 and the search radius of low curvature is set as 36.

All point clouds show a similar trend in their R.D. curves. Figure 12 provides a detailed performance description of Redandblack on geometry and color. The curvature-based hierarchical method performs like V-PCC at low bpp, but is less costly and delivers better quality at high bpp. The results indicate that the proposed method is valid for the dynamic condition.
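
R.D. curves such as those in Figure 12 are often summarized by a single number. One common choice is the Bjøntegaard delta PSNR, which fits a cubic polynomial to PSNR as a function of log bitrate for each codec and averages the gap over the shared bitrate range. The sketch below is illustrative only: the function name `bd_psnr` and the sample values are hypothetical and are not taken from the paper's tables.

```python
import numpy as np


def bd_psnr(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average PSNR gap (dB) between two R.D. curves (Bjontegaard-style).

    Each curve needs at least four (rate, PSNR) points for the cubic fit.
    """
    log_ref = np.log10(np.asarray(rate_ref, dtype=float))
    log_test = np.log10(np.asarray(rate_test, dtype=float))

    # Cubic fit of PSNR against log10(bitrate) for each curve.
    poly_ref = np.polyfit(log_ref, np.asarray(psnr_ref, dtype=float), 3)
    poly_test = np.polyfit(log_test, np.asarray(psnr_test, dtype=float), 3)

    # Integrate both fits over the overlapping log-rate interval.
    lo = max(log_ref.min(), log_test.min())
    hi = min(log_ref.max(), log_test.max())
    int_ref = np.polyval(np.polyint(poly_ref), [lo, hi])
    int_test = np.polyval(np.polyint(poly_test), [lo, hi])

    area_ref = int_ref[1] - int_ref[0]
    area_test = int_test[1] - int_test[0]
    return (area_test - area_ref) / (hi - lo)


# Synthetic example: the test curve is the reference shifted up by 1 dB,
# so the result is approximately 1.0 dB.
rates = [0.1, 0.2, 0.4, 0.8]           # bits per point (made-up values)
reference = [30.0, 33.0, 36.0, 39.0]   # PSNR in dB (made-up values)
gain = bd_psnr(rates, reference, rates, [p + 1.0 for p in reference])
```

A positive value means the tested codec reaches a higher PSNR than the reference at the same bitrate, averaged over the range where both curves are defined.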

From Tables 2 and 3, our method clearly outperforms V-PCC with a voxel size of four in geometry and color performance. However, there is also an average 50.52% increase in runtime, because the smaller voxel size increases the complexity of the neighborhood search and of the final score computation. This is negligible in most applications that do not have extreme real-time constraints. At the same time, the comparison with a voxel size of two shows that the proposed approach improves both performance and efficiency, saving an average of 33.63% of the total time. After the tradeoff between real-time performance and accuracy, our method achieves a clear quality improvement and, in most cases, shortens the encoder's runtime.

The visual effects of our method, compared with V-PCC, are shown in Figures 13-15. The point clouds compressed with a voxel size of four are generally unsmooth and have apparent cracks. Those compressed with a voxel size of two are better than those with a voxel size of four, but some cracks remain. The point clouds produced by our approach are closest to the original point clouds. Therefore, the method proposed in this paper achieves better visual effects than V-PCC, consistent with the R.D. performance analysis described earlier.

In this paper, we proposed an improved dynamic PCC method based on curvature estimation and a hierarchical strategy to reduce the runtime of the video-based compression scheme and obtain a clear quality gain. First, the proposed method segments the original data into six primary clusters using normal similarity. Second, a curvature-based hierarchical refining approach optimizes the clusters. Finally, image generation technology and a video codec map the point cloud to 2D images and compress them.
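
The first step, segmentation into six primary clusters by normal similarity, can be sketched as follows. The sketch assumes that the six clusters correspond to the six axis-aligned projection directions used in video-based point cloud compression and that similarity is measured by the dot product; the names `AXIS_DIRECTIONS` and `initial_segmentation` are hypothetical.

```python
import numpy as np

# Assumed cluster orientations: +X, +Y, +Z, -X, -Y, -Z.
AXIS_DIRECTIONS = np.array(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1],
     [-1, 0, 0], [0, -1, 0], [0, 0, -1]],
    dtype=float,
)


def initial_segmentation(normals):
    """Assign each point to the direction most similar to its normal."""
    normals = np.asarray(normals, dtype=float)
    # (N, 6) matrix of dot products between normals and directions.
    similarity = normals @ AXIS_DIRECTIONS.T
    return np.argmax(similarity, axis=1)


# Usage with three made-up unit-like normals.
labels = initial_segmentation([[0.9, 0.1, 0.0],
                               [0.0, -1.0, 0.0],
                               [0.1, 0.2, 0.97]])
# labels -> array([0, 4, 2])
```

Because each point is labeled independently, the resulting clusters can have ragged boundaries, which is what the refining step is meant to smooth.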

The curvature-based hierarchical method proceeds as follows. It first generates voxelized identification points by partitioning the geometric coordinate space. Next, it classifies the identification points into low-curvature points and high-curvature points. Then, it estimates the neighboring voxels and final scores hierarchically. Last, it assigns each point to the cluster with the highest score to obtain patches with smoother boundaries and fewer repeated points.
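
The flow above can be sketched in a few lines. This is a simplified illustration under several assumptions, not the authors' implementation: identification points are taken to be voxel centers, the "hierarchical" step is modeled as a per-voxel search radius chosen by curvature class, the final score is the normal similarity plus a weighted share of neighboring points already in each cluster, and the neighbor search is brute force. The function name, `voxel_size`, and `smooth_weight` are hypothetical.

```python
import numpy as np

AXES = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                 [-1, 0, 0], [0, -1, 0], [0, 0, -1]], dtype=float)


def refine_clusters(points, normals, labels, curvature, voxel_size=2,
                    low_ratio=0.92, r_low=36.0, r_high=96.0,
                    smooth_weight=3.0, iterations=1):
    points = np.asarray(points, dtype=float)
    normals = np.asarray(normals, dtype=float)
    labels = np.asarray(labels, dtype=np.int64).copy()
    curvature = np.asarray(curvature, dtype=float)

    # 1. Voxelize: every occupied voxel becomes one identification point.
    keys = np.floor(points / voxel_size).astype(np.int64)
    voxels, point_to_voxel = np.unique(keys, axis=0, return_inverse=True)
    point_to_voxel = point_to_voxel.reshape(-1)
    centers = (voxels + 0.5) * voxel_size
    n_vox = len(voxels)

    # 2. Classify voxels by mean curvature and pick a radius per class.
    counts = np.bincount(point_to_voxel, minlength=n_vox)
    voxel_curv = np.bincount(point_to_voxel, weights=curvature,
                             minlength=n_vox) / counts
    is_low = voxel_curv <= np.quantile(voxel_curv, low_ratio)
    radius = np.where(is_low, r_low, r_high)

    # 3. Neighboring voxels (brute force, O(V^2); fine for a small demo).
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    neighbors = (dist <= radius[:, None]).astype(float)

    normal_score = normals @ AXES.T
    for _ in range(iterations):
        # Per-voxel histogram of the current cluster labels.
        hist = np.zeros((n_vox, len(AXES)))
        np.add.at(hist, (point_to_voxel, labels), 1.0)
        # Share of neighboring points that belong to each cluster.
        neigh = neighbors @ hist
        smooth = neigh / neigh.sum(axis=1, keepdims=True)
        # 4. Final score; each point moves to its best-scoring cluster.
        final = normal_score + smooth_weight * smooth[point_to_voxel]
        labels = np.argmax(final, axis=1)
    return labels
```

In this sketch a point whose normal is ambiguous is pulled toward the cluster that dominates its neighborhood, which is how the smoother patch boundaries described above would arise. A production encoder would replace the brute-force distance matrix with a spatial index.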

The experimental results show that the proposed scheme saves 33.63% of the compression time on average, with clear R.D. performance benefits and improved subjective quality, and is suitable for most AR/VR applications. However, the curvature-based hierarchical method uses only the characteristics of the geometric space. In future work, we will consider using the properties of the attribute space to improve compression quality.

Informed Consent Statement: Not applicable.

The authors declare no conflict of interest.