Abstract. In recent years, connectomics, a field that aims to construct the wiring diagram of the brain at the level of synapses and their connections, has attracted increasing interest and holds great promise for neuroscience. Advances in technology make it possible to acquire huge volumes of data, and the challenge for researchers and scientists now lies in the storage, visualization, and processing of these data. Evolving GPU technology provides effective solutions for computing on large datasets and for visualizing them. Moreover, several recent works propose techniques for optimized segmentation, reconstruction, proofreading, and visualization of the connectome. However, existing solutions are still not efficient enough, and many research directions remain for the future development of the proposed systems.
Keywords. Connectome, electron microscopes, GPU, wiring diagram, reconstruction, segmentation, proofreading, visualization, mipmap, octree.
The brain comprises up to hundreds of billions of nerve cells, called neurons, that are interconnected through synaptic connections and transmit electrical signals (Al-Awami et al., 2014). Each individual neuron consists of a cell body, dendrites, and one axon. The brain of an adult human contains more than 100 billion neurons and 250 trillion neuronal connections, while at the same time individual cellular structures are on the order of nanometers; for example, a thin dendritic spine is about 50 nm in diameter, and a synaptic cleft is about 20 nm wide (Jeong et al., 2010).
Whereas dendrites receive information from neighboring neurons, the axon transmits it away from the cell body. Dendrites and the axon are together called neurites. Depending on the electrical signal, a synapse can be excitatory or inhibitory. Other synapse characteristics are the position of the post-synaptic terminal and the vesicles.
Connectomics is a field dealing with the mapping of nervous system networks at the level of synapses and their connections (Lichtman, Pfister & Shavit, 2014). It was introduced in 1970 with the investigation of the worm nervous system and, owing to technological advances, is receiving growing interest from researchers. Determining the connectome, the wiring diagram of the brain at the resolution of synapses and neurons, is a challenge of the 21st century and is necessary to understand how the brain works and develops and how memories are formed (Al-Awami et al., 2014). A major problem for visualization is that the data are very complex, i.e., one neuron can have up to 10,000 connections with other neurons.
Motivation and Challenges in Reconstructing, Analyzing, and Visualizing the Connectome
Reconstruction of the human connectome is one of the great scientific challenges of current research, and connectomics works on reconstructing and mapping the brain's neural circuits to understand their structure and functionality, and with this to be able to treat pathologies such as Alzheimer's disease (Beyer et al., 2013). A major endeavor for scientists is the acquisition, storage, and processing of huge volumes of data. Recent technological advances make it possible to acquire huge amounts of data with high speed and accuracy. For example, microtomes and electron microscopes (EM) reach a pixel resolution of 3-5 nm, which is sufficient to image neural connections at the level of single synapses. However, this results in very large data volumes (petabytes), which take a long time, up to several years, to acquire, and which therefore require algorithms capable of processing incomplete data. Moreover, it is essential to be able to visualize the obtained data in 3D in order to understand and proofread it.
The description of cellular connections and communication must be performed at the nanometer scale to capture such details as synapse volume, the number of vesicles in a single synapse, the location of every mitochondrion and every node of Ranvier, glial cell investment, etc. (Lichtman, Pfister & Shavit, 2014). Light microscopes are limited by diffraction, so researchers use electron microscopy, whose high resolution makes cells and organelles visible. With existing technology, it is now possible to acquire connectomic data at a rate of 1 terabyte (sixteen 1 mm2 images) per day, which means that it takes about 6 years to finish a cubic millimeter of brain. To resolve this problem, parallelized image acquisition technologies are being developed. The next step is image data alignment, which is less challenging thanks to the high resolution. After alignment comes data reconstruction, i.e., segmentation of the obtained images: identifying which neuron or glial cell each image pixel represents. This is very problematic for a number of reasons, including irregularities in the shapes of neural objects, the absence of information on the actual number of cellular objects and synaptic connections in a volume, and, finally, cell membrane intensity values that overlap with those of other organelles. Therefore, research is now focused on finding automatic methods for reconstructing brain volumes with acceptable error and reasonable processing time. Furthermore, another endeavor in examining brain data is the detection of features such as mitochondria and synaptic vesicles, again with reasonable error and time. Further challenges are the storage of the large amounts of data and their transmission from the microscopes to the processing machines.
Once connectomic data are reconstructed, they should be visualized as a wiring diagram containing the locations of synapses along a dendrite, synapse sizes, etc., in a reduced form. Figure 1 demonstrates an example in which the axon of the neuron depicted in dark gray has four synapses on dendritic spines of the neuron illustrated in light gray. The layout graph then uses less information and represents only the locations of axonal and dendritic branch points. Finally, the connectivity graph represents only the bare connections.
GPU Computing in Visualizing the Connectome and Its Importance
GPU technology started to develop two decades ago and provides parallel processing that makes expensive computations such as ray-casting feasible at interactive rates (Beyer, Hadwiger & Pfister, 2015). However, GPU memory is limited and is not expanding at the same rate as the acquired data. Therefore, it is necessary to develop effective data structures, architectures, and visualization algorithms.
In general, a visualization framework uses an abstraction called the visualization pipeline, which describes the data flow in the system: data acquisition, data processing (stages from pre-processing to filtering), visualization mapping, and rendering. Several volume visualization techniques currently exist, including ray-guided and visualization-driven approaches. The ray-casting approach is parallel, deals effectively with missing data, and allows early ray termination. Its focus is on output sensitivity, which means that the running time is determined by the output size rather than the input size; accordingly, the working set consists only of the bricks crossed during ray traversal. Bricking is a decomposition technique that divides the volume into sub-volumes, called bricks, of the same size. This makes a multi-resolution hierarchy possible, which is useful for sampling the data at the resolution suitable for a particular screen. Furthermore, image-space decomposition in the ray-casting approach allows each pixel to be processed individually.
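To make the ray-casting ideas above concrete, here is a minimal CPU sketch of front-to-back compositing with early ray termination; the volume sampler and transfer function are toy stand-ins (a real ray-guided renderer samples a brick cache at an appropriate resolution level), and all names are illustrative rather than taken from the cited systems.

#include <algorithm>
#include <array>
#include <cmath>

struct Color { float r = 0, g = 0, b = 0, a = 0; };

// Toy density field (a soft sphere) standing in for a brick-cache lookup.
float sampleVolume(float x, float y, float z) {
    float d = std::sqrt(x * x + y * y + z * z);
    return std::clamp(1.0f - d, 0.0f, 1.0f);
}

// Toy transfer function mapping density to color and opacity.
Color transferFunction(float density) {
    return {density, density, density, 0.1f * density};
}

Color castRay(std::array<float, 3> o, std::array<float, 3> dir,
              float tMax, float dt) {
    Color out;
    for (float t = 0.0f; t < tMax; t += dt) {
        float s = sampleVolume(o[0] + t * dir[0], o[1] + t * dir[1],
                               o[2] + t * dir[2]);
        Color c = transferFunction(s);
        // Front-to-back compositing: weight new samples by the
        // transparency remaining along the ray.
        out.r += (1.0f - out.a) * c.a * c.r;
        out.g += (1.0f - out.a) * c.a * c.g;
        out.b += (1.0f - out.a) * c.a * c.b;
        out.a += (1.0f - out.a) * c.a;
        // Early ray termination: once the ray is nearly opaque, bricks
        // farther along it never enter the working set, which is what
        // makes the approach output-sensitive.
        if (out.a > 0.99f) break;
    }
    return out;
}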
For volume rendering, various data structures are used, including trees, mipmaps, hierarchical grids, and wavelets. Trees, including octrees, kd-trees, and sparse voxel octrees (SVOs), are very popular and provide efficient traversal and a natural hierarchy. Mipmaps are multi-resolution pyramids widely used in GPU texture mapping. A good alternative to trees are hierarchical grids with bricking, which brick each resolution level independently. Moreover, instead of tree traversal they provide an address space such that data of any resolution is accessed using "address translation from virtual to physical addresses" (Beyer, Hadwiger & Pfister, 2015).
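The following sketch illustrates such an address translation with a simple page table that maps virtual brick coordinates (resolution level plus brick index) to physical slots in a cache texture; the layout and names are assumptions for illustration, not the exact scheme of any cited system.

#include <cstddef>
#include <optional>
#include <unordered_map>

struct VirtualBrick {                 // brick coordinates at one resolution level
    int level, bx, by, bz;
    bool operator==(const VirtualBrick& o) const {
        return level == o.level && bx == o.bx && by == o.by && bz == o.bz;
    }
};

struct VirtualBrickHash {
    std::size_t operator()(const VirtualBrick& v) const {
        return (std::size_t)v.level * 73856093u ^ (std::size_t)v.bx * 19349663u
             ^ (std::size_t)v.by * 83492791u ^ (std::size_t)v.bz * 2654435761u;
    }
};

struct PhysicalBrick { int px, py, pz; };   // slot in the brick cache texture

class PageTable {
    std::unordered_map<VirtualBrick, PhysicalBrick, VirtualBrickHash> table_;
public:
    void map(const VirtualBrick& v, const PhysicalBrick& p) { table_[v] = p; }
    // An empty result means the brick is not resident: the renderer must
    // fall back to a coarser level or request the brick asynchronously.
    std::optional<PhysicalBrick> translate(const VirtualBrick& v) const {
        auto it = table_.find(v);
        if (it == table_.end()) return std::nullopt;
        return it->second;
    }
};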
The amount of data that needs to be handled depends on when the data are processed. For example, pre-processing the data significantly reduces the computation time needed for rendering. On-demand processing renders, processes, or loads only the data necessary for the current view; a special case of this technique is query-driven visualization with selection. Data can also be processed as they become available; this method, called streaming, does not require the whole dataset to be available before visualization starts.
Core Technologies Used in Segmentation, Reconstruction, Proofreading, and Visualization of the Connectome
Beyer et al. (2013) propose a framework for processing and visualizing data on the order of petavoxels that can also work on incomplete data. The visualization technique utilizes GPU volume ray-casting, for which the major limitation is memory size. To accommodate large volumes, the researchers developed "out-of-core and multi-resolution volume rendering based on hierarchical octree bricking schemes" (Beyer et al., 2013). Furthermore, the system makes it possible to access any resolution without traversing the whole hierarchy of resolution levels, using virtual memory for this purpose. Notably, the acquisition equipment is not connected to a supercomputer.
The developed framework comprises two main parts, a data-driven one and a visualization-driven one. The system overview is presented in Figure 2. First, data are obtained with a microscope of fixed resolution and stored in the acquisition archive. Raw tile processing automatically processes EM tiles as they come from the microscope, compresses them, and stores them in the visualization archive. For each tile, this stage creates a 2D mipmap and further splits it into sub-tiles for efficient disk storage and access. The data then go through registration and segmentation stages. In the visualization part, volume construction is performed: the data for a specified resolution are constructed and loaded onto the GPU only if requested by the ray-caster. Construction is completed in two steps: first, the system identifies the 2D sub-tiles that constitute the requested 3D block, and second, these are stitched into the 3D grid. To deal with missing data, multi-threading is used, and the ray-caster replaces a missing data block with a lower-resolution version or, if none is available, skips the block.
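The fallback logic for missing data can be pictured as follows; this is a minimal sketch in which resolveBlock and the residency test are hypothetical names (the test could be the page-table lookup sketched earlier), not the actual implementation.

#include <functional>
#include <optional>

struct BlockRequest { int level, bx, by, bz; };   // level 0 = finest

// Walk from the requested level toward coarser levels until a resident
// version of the block is found; an empty result means the ray-caster
// skips the block entirely.
std::optional<BlockRequest>
resolveBlock(BlockRequest r, int coarsestLevel,
             const std::function<bool(const BlockRequest&)>& isResident) {
    for (; r.level <= coarsestLevel;
         ++r.level, r.bx /= 2, r.by /= 2, r.bz /= 2) {
        if (isResident(r)) return r;   // use the lower resolution instead
    }
    return std::nullopt;               // nothing available: skip the block
}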
Al-Awami et al. (2014) propose a novel technique called NeuroLines, which is based on the anatomical structure of axons and dendrites and provides a simple 2D subway-map representation that removes complex branching and anatomical details and preserves only topology, connectivity, and synapse sequence, as shown in Figure 3. It provides a multi-scale visualization of the data consisting of a navigation bar (a high-level view of the working set, where each neurite is depicted as a single colored line), a neurite overview (a medium-level view showing a subset of neurites), and a workspace view (a low-level detailed view of individual neurites, branches, and synapses). The system is implemented in C++ and OpenGL and requires an NVIDIA GPU to perform 3D volume rendering.
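The data reduction behind such a subway-map view can be sketched as flattening a neurite tree into per-branch 1D sequences of synapses, discarding geometry but keeping topology and synapse order; the structures below are illustrative and are not the actual NeuroLines implementation.

#include <cstddef>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

struct BranchNode {
    std::vector<int> synapseIds;         // synapses in order along the branch
    std::vector<BranchNode> children;    // sub-branches at the branch point
};

// Depth-first flattening: one "subway line" (label + synapse sequence)
// per branch; 3D positions never enter the output.
void flatten(const BranchNode& n, const std::string& path,
             std::vector<std::pair<std::string, std::vector<int>>>& out) {
    out.push_back({path, n.synapseIds});
    for (std::size_t i = 0; i < n.children.size(); ++i)
        flatten(n.children[i], path + "." + std::to_string(i), out);
}

int main() {
    BranchNode dendrite{{1, 2}, {BranchNode{{3}, {}}, BranchNode{{4, 5}, {}}}};
    std::vector<std::pair<std::string, std::vector<int>>> lines;
    flatten(dendrite, "d0", lines);
    for (const auto& [path, synapses] : lines) {
        std::cout << path << ":";
        for (int s : synapses) std::cout << " s" << s;
        std::cout << "\n";
    }
}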
Kaynig et al. (2015) developed and presented a pipeline for partially automated 3D reconstruction from EM images. The system can work with large amounts of data, provides computer-aided proofreading of the output, and reduces the user's participation in the initial training of the random forest classifier. According to the quantitative assessments, the researchers obtained 27,000 μm3 of reconstructed data, which can be considered the largest automatically reconstructed volume of mammalian brain imagery to date.
The pipeline overview is shown in Figure 4. The first stage focuses on 2D segmentation of the high-resolution images. First, a random forest classifier is trained on manual membrane annotations to detect membranes; afterwards, segmentation hypotheses per section are produced from its output. Random forests allow the training and prediction processes to be parallelized and deal effectively with overfitting, and therefore generalize well. User interaction is minimal: the system requires the user only to provide membrane annotations and to correct the output in a feedback loop. The 2D segmentations are then combined into 3D objects through a segmentation fusion process. The final stage is proofreading, performed by the user or with the semi-automatic tool Mojo.
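The prediction side of a random forest is simple to picture: each tree routes a pixel's feature vector to a leaf, and the forest averages the per-tree membrane probabilities. The toy sketch below illustrates this; the structures are hypothetical and are not the trained classifier of Kaynig et al. (2015).

#include <memory>
#include <vector>

struct TreeNode {
    int featureIndex = -1;    // -1 marks a leaf
    float threshold = 0.0f;   // split threshold for inner nodes
    float leafProb = 0.0f;    // P(membrane) stored at leaves
    std::unique_ptr<TreeNode> left, right;
};

float predictTree(const TreeNode& n, const std::vector<float>& features) {
    if (n.featureIndex < 0) return n.leafProb;
    const TreeNode& next =
        features[n.featureIndex] < n.threshold ? *n.left : *n.right;
    return predictTree(next, features);
}

// Averaging many decorrelated trees is what curbs overfitting; both
// training (bagging with random feature subsets) and prediction
// parallelize naturally over trees and over pixels.
float predictForest(const std::vector<std::unique_ptr<TreeNode>>& forest,
                    const std::vector<float>& features) {
    float sum = 0.0f;
    for (const auto& tree : forest) sum += predictTree(*tree, features);
    return forest.empty() ? 0.0f : sum / forest.size();
}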
As the data are acquired, they need to be segmented, labeled, and analyzed. Segmentation is currently the main bottleneck in the connectomics field: existing methods are mostly manual, while fully automatic ones produce serious errors (Al-Awami et al., 2016). Therefore, it is necessary to use semi-automatic proofreading methods, which require interaction from the user.
Al-Awami et al. (2016) developed a new visualization system, NeuroBlocks, for analyzing the state, progress, and evolution of EM segmentation data. It is a multi-user web application that provides interactive 2D and 3D visualization of image and volume data. The design of NeuroBlocks is presented in Figure 5, which shows explicitly that the system allows its users not only to analyze and study the data but also to keep track of progress and make notes. Its main view is the segmentation state pixel view, which is data-oriented and shows the current state of the segmentation process. Here, "pixel" denotes one segment in a hierarchy arranged as segment ∈ dendrite ∈ neuron. The user then adjusts settings for sorting and filtering the pixel view, which has three modes: nested hierarchy, treemap, and flat view. Moreover, a multi-scale visualization of the pixels ensures that the whole dataset remains visible on the screen: it is computed over the sorting order and collects adjacent pixels into one "super-pixel". To navigate between the various segmentation stages, the researchers developed an approach based on a timeline rather than on graphs. In addition, NeuroBlocks makes it possible to track user-defined attributes over time, such as user activity, the number of changes, and the status of a task.
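A minimal sketch of such a super-pixel aggregation is shown below: segments in sorting order are collapsed into groups so that at most a given number of pixels appears on screen. The aggregation rule (the mean of a per-segment progress value) is an illustrative assumption, not the rule NeuroBlocks uses.

#include <algorithm>
#include <cstddef>
#include <vector>

// One value per segment in sorting order, e.g. proofreading progress
// in [0, 1]; the output has at most maxOnScreen super-pixels.
std::vector<float> toSuperPixels(const std::vector<float>& segments,
                                 std::size_t maxOnScreen) {
    std::vector<float> out;
    if (segments.empty() || maxOnScreen == 0) return out;
    std::size_t group = (segments.size() + maxOnScreen - 1) / maxOnScreen;
    for (std::size_t i = 0; i < segments.size(); i += group) {
        std::size_t end = std::min(i + group, segments.size());
        float sum = 0.0f;
        for (std::size_t j = i; j < end; ++j) sum += segments[j];
        out.push_back(sum / (end - i));   // one super-pixel per group
    }
    return out;
}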
The system also offers segment and object views, which are more detailed, and a connectivity view, which shows an object together with its connected neighbours. In addition, the NeuroBlocks pipeline provides management views such as a task view and a user view. The latter is oriented more toward project managers, who can manage the different users and assign them to groups with specific views and functionalities. Other users of NeuroBlocks are segmenters, proofreaders, and senior segmenters. Unlike project managers, segmenters can only work on segments and tasks and need a supervisor's approval for their changes to be accepted. The user space also includes auditors, who guarantee good project quality standards and, as with the other users, have their working progress tracked and summarized in statistics.
In the NeuroBlocks system, the data are stored in a MongoDB database and in a multi-resolution format on a shared file system that can be accessed both directly and through the server. Furthermore, when changes are made to the data, only the changes are saved, not the whole data volume.
Beyer et al. (2013) introduce ConnectomeExplorer, a query-guided visual analysis system. Its main characteristic is that it allows the user to formulate and answer domain-specific questions through interactive investigation of the data or via dynamic queries, which the interface translates into a query algebra. Three kinds of queries are available: spatial queries focusing on regions of interest, topological queries that identify neuron connections, and attribute queries over automatically calculated attributes, such as object volume, or manually labeled ones. Queries return their results as sets, which can be visualized in all provided views, fed as input into other queries, or saved to disk. The researchers chose a set-algebra representation for queries, built on sets of objects, tuples of objects, and predicates, instead of conventional SQL. The query algebra keeps things simple for the user, who does not need to know the actual representation of the connections, their internal storage mechanisms, or the SQL query language. Topological queries are supported through predicates termed connected and part of, which define the corresponding relations between objects. To let the user create queries dynamically, a visual query builder is included in the system.
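As a purely illustrative sketch of such a set algebra (the names ObjectSet, filter, and connectedTo are hypothetical, not ConnectomeExplorer's API), queries can be modeled as sets of object IDs that are combined with set operations and filtered by predicates, so that each result can feed the next query:

#include <functional>
#include <set>

using ObjectSet = std::set<int>;   // a query result: a set of object IDs

ObjectSet intersect(const ObjectSet& a, const ObjectSet& b) {
    ObjectSet r;
    for (int id : a)
        if (b.count(id)) r.insert(id);
    return r;
}

// Keep only the objects satisfying a predicate; the result is again a
// set, so it can be visualized or used as input to further queries.
ObjectSet filter(const ObjectSet& in, const std::function<bool(int)>& pred) {
    ObjectSet r;
    for (int id : in)
        if (pred(id)) r.insert(id);
    return r;
}

// A "connected" predicate evaluated against some adjacency oracle,
// e.g. a precomputed synapse table.
ObjectSet connectedTo(const ObjectSet& in, int target,
                      const std::function<bool(int, int)>& adjacent) {
    return filter(in, [&](int id) { return adjacent(id, target); });
}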
After segmentation, the data are stored as image data in which each voxel is associated with an integer object ID; to manage the large number of objects, the system provides up to 24 bits per object ID. The system renders synapses as small shaded spheres. One of the benefits of this system is that the user can select a region of interest (ROI) of cylindrical, spherical, or box shape; to realize this, the start and stop points of the rays in the ray-caster are set according to the particular ROI shape. Finally, simple algebra queries offer statistical analysis.
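Setting ray start and stop points for a spherical ROI reduces to standard ray-sphere intersection, as in the following minimal, illustrative sketch (assuming a normalized ray direction):

#include <algorithm>
#include <cmath>
#include <optional>
#include <utility>

struct Vec3 { float x, y, z; };
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the parametric interval [tStart, tStop] where the ray
// o + t * d lies inside the sphere, or nothing if the ray misses the
// ROI; the ray-caster then samples only within this interval.
std::optional<std::pair<float, float>>
clipRayToSphere(Vec3 o, Vec3 d, Vec3 center, float radius) {
    Vec3 oc{o.x - center.x, o.y - center.y, o.z - center.z};
    float b = dot(oc, d);
    float c = dot(oc, oc) - radius * radius;
    float disc = b * b - c;
    if (disc < 0.0f) return std::nullopt;            // ray misses the ROI
    float s = std::sqrt(disc);
    float t0 = -b - s, t1 = -b + s;
    if (t1 < 0.0f) return std::nullopt;              // ROI behind the eye
    return std::make_pair(std::max(t0, 0.0f), t1);
}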
Haehn et al. (2014) developed two tools for proofreading large amounts of complex EM data, Mojo and Dojo, targeted at both experienced and novice users. Dojo is the more advanced version of Mojo in that it is easier to use and offers 3D rendering of the volume data. Both were quantitatively shown to be faster and better than other existing proofreading techniques. Moreover, they provide a 3D view for the third step of visual proofreading, which is faster and more accurate. Mojo has tools for correcting split, merge, and boundary-adjustment errors. Dojo is oriented more toward non-expert users; therefore, all elements are shown as icons, and operations are simple mouse-click actions. Furthermore, each tool has its own interaction mode, which makes the user interface even simpler.
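At the data level, a proofreading merge is just a relabeling of the segmentation volume, as in this illustrative sketch (a split would be the inverse, assigning a fresh ID to a subset of a segment's voxels); real tools such as Mojo and Dojo of course operate on much larger, tiled volumes:

#include <cstdint>
#include <vector>

// Merge segment fromId into segment intoId by relabeling its voxels.
void mergeSegments(std::vector<std::uint32_t>& labels, std::uint32_t fromId,
                   std::uint32_t intoId) {
    for (auto& voxel : labels)
        if (voxel == fromId) voxel = intoId;
}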
Jeong et al. (2010) developed the tools SSECRETT (Serial Section Reconstruction and Tracing Tool) and NeuroTrace. SSECRETT provides slice-based 2D volume exploration and manual tracing of axons, supports large amounts of acquired data, and can be used by multiple users. NeuroTrace provides interactive segmentation and high-quality 3D visualization of data obtained with high-resolution EM. The system utilizes parallel computing, which significantly increases performance; single-pass ray-casting is implemented with the use of OpenGL and CUDA kernels. In NeuroTrace, data management follows an out-of-core approach, and the memory hierarchy has three levels: a GPU cache (containing all blocks currently necessary for ray-casting), a CPU cache (a superset of the blocks in the GPU cache), and an octree cache (the octree nodes that constitute the volume hierarchy).
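A three-level lookup of this kind can be sketched as follows; the interfaces are illustrative assumptions, not the NeuroTrace API, and a real implementation would also evict blocks (e.g., with an LRU policy).

#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

using BlockId = std::uint64_t;
using Block = std::vector<std::uint8_t>;

struct Cache {
    std::unordered_map<BlockId, Block> blocks;
    std::optional<Block> get(BlockId id) const {
        auto it = blocks.find(id);
        if (it == blocks.end()) return std::nullopt;
        return it->second;
    }
    void put(BlockId id, Block b) { blocks[id] = std::move(b); }
};

// Stub standing in for a fetch from the octree store on disk.
Block loadFromOctreeStore(BlockId) { return Block(64 * 64 * 64, 0); }

// Try the GPU cache, then the CPU cache, then disk, promoting the block
// on the way back so the CPU cache stays a superset of the GPU cache.
Block fetchBlock(Cache& gpuCache, Cache& cpuCache, BlockId id) {
    if (auto b = gpuCache.get(id)) return *b;   // GPU hit
    if (auto b = cpuCache.get(id)) {            // CPU hit: promote to GPU
        gpuCache.put(id, *b);
        return *b;
    }
    Block b = loadFromOctreeStore(id);          // miss: go to the octree store
    cpuCache.put(id, b);
    gpuCache.put(id, b);
    return b;
}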
Jurrus et al. (2010) present a method for learning models, based on Radon-like features (RLF) in addition to image pixel intensities, that can efficiently obtain information on cell membranes, such as cell boundaries. First, RLF divides an edge map into regions according to a defined geometry. Second, it computes line segments for all directions. Finally, an extraction function produces a scalar value for each direction, thereby extracting cell boundaries, mitochondria, and vesicles.
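A simplified, illustrative version of this idea for a single pixel and a single direction is sketched below: march along the scan line, find two consecutive edge crossings in the edge map, and apply an extraction function (here, the mean image intensity between the two edges) to obtain one scalar per direction. The parameters and the choice of extraction function are assumptions for illustration, not the published feature definition.

#include <cmath>
#include <cstddef>
#include <vector>

struct Image {
    int w = 0, h = 0;
    std::vector<float> px;   // row-major pixel data
    float at(int x, int y) const {
        return px[static_cast<std::size_t>(y) * w + x];
    }
    bool inside(int x, int y) const {
        return x >= 0 && x < w && y >= 0 && y < h;
    }
};

// Scalar feature at (x, y) for scan direction theta (radians).
float radonLikeFeature(const Image& img, const Image& edges,
                       int x, int y, float theta, float edgeThresh = 0.5f) {
    float dx = std::cos(theta), dy = std::sin(theta);
    int firstEdge = -1, secondEdge = -1;
    // March along the line until two edge crossings are found.
    for (int t = 1; t < 64; ++t) {
        int cx = x + static_cast<int>(std::lround(t * dx));
        int cy = y + static_cast<int>(std::lround(t * dy));
        if (!img.inside(cx, cy)) break;
        if (edges.at(cx, cy) > edgeThresh) {
            if (firstEdge < 0) firstEdge = t;
            else { secondEdge = t; break; }
        }
    }
    if (firstEdge < 0 || secondEdge < 0) return 0.0f;   // no bracketing edges
    // Extraction function: mean intensity between the two edges.
    float sum = 0.0f;
    int n = 0;
    for (int t = firstEdge; t <= secondEdge; ++t) {
        int cx = x + static_cast<int>(std::lround(t * dx));
        int cy = y + static_cast<int>(std::lround(t * dy));
        if (img.inside(cx, cy)) { sum += img.at(cx, cy); ++n; }
    }
    return n > 0 ? sum / n : 0.0f;
}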
References
Al-Awami, A., Beyer, J., Haehn, D., Kasthuri, N., Lichtman, J., Pfister, H., & Hadwiger, M. (2016). NeuroBlocks – Visual Tracking of Segmentation and Proofreading for Large Connectomics Projects. IEEE Transactions on Visualization and Computer Graphics, 22(1), 738-746. Retrieved from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7192653&isnumber=7307919
Al-Awami, A., Beyer, J., Strobelt, H., Kasthuri, N., Lichtman, J., Pfister, H., & Hadwiger, M. (2014). NeuroLines: A Subway Map Metaphor for Visualizing Nanoscale Neuronal Connectivity. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2369-2378. http://dx.doi.org/10.1109/tvcg.2014.2346312
Beyer, J., Al-Awami, A., Kasthuri, N., Lichtman, J., Pfister, H., & Hadwiger, M. (2013). ConnectomeExplorer: Query-Guided Visual Analysis of Large Volumetric Neuroscience Data. IEEE Transactions on Visualization and Computer Graphics, 19(12), 2868-2877. http://dx.doi.org/10.1109/tvcg.2013.142
Beyer, J., Hadwiger, M., & Pfister, H. (2015). State-of-the-Art in GPU-Based Large-Scale Volume Visualization. Computer Graphics Forum, 34(8), 13-37. http://dx.doi.org/10.1111/cgf.12605
Beyer, J., Hadwiger, M., Al-Awami, A., Jeong, W.-K., Kasthuri, N., Lichtman, J., & Pfister, H. (2013). Exploring the Connectome: Petascale Volume Visualization of Microscopy Data Streams. IEEE Computer Graphics and Applications, 33(4), 50-61. http://dx.doi.org/10.1109/mcg.2013.55
Haehn, D., Knowles-Barley, S., Roberts, M., Beyer, J., Kasthuri, N., Lichtman, J., & Pfister, H. (2014). Design and Evaluation of Interactive Proofreading Tools for Connectomics. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2466-2475. http://dx.doi.org/10.1109/tvcg.2014.2346371
Jurrus, E., Paiva, A., Watanabe, S., Anderson, J., Jones, B., & Whitaker, R. et al. (2010). Detection of neuron membranes in electron microscopy images using a serial neural network architecture. Medical Image Analysis, 14(6), 770-783. http://dx.doi.org/10.1016/j.media.2010.06.002
Kaynig, V., Vazquez-Reina, A., Knowles-Barley, S., Roberts, M., Jones, T., & Kasthuri, N. et al. (2015). Large-scale automatic reconstruction of neuronal processes from electron microscopy images. Medical Image Analysis, 22(1), 77-88. http://dx.doi.org/10.1016/j.media.2015.02.001
Lichtman, J., Pfister, H., & Shavit, N. (2014). The big data challenges of connectomics. Nature Neuroscience, 17(11), 1448-1454. http://dx.doi.org/10.1038/nn.3837
Jeong, W.-K., Beyer, J., Hadwiger, M., Blue, R., Law, C., & Vazquez-Reina, A. et al. (2010). Ssecrett and NeuroTrace: Interactive Visualization and Analysis Tools for Large-Scale Neuroscience Data Sets. IEEE Computer Graphics and Applications, 30(3), 58-70. http://dx.doi.org/10.1109/mcg.2010.56