172022.01

Swarm virus

Think of it as a potluck, Domingo said. The host asks people to bring many types of dishes. For a virus, a variety of options allows it not only to infect different hosts, as chik and dengue do with humans and mosquitoes, but also to infect different tissues within the same host.

Polio infections tend to start in the gut and move to the brain; Vignuzzi wanted to study the role that viral diversity plays in the leap. He started by engineering a polio virus that copied its genome with fewer errors than usual.

Next, Vignuzzi and his colleagues used chemicals to induce mutations in this polio virus, enlarging the size of the mutant cloud. The polio traveled from the gut and into the brain, which it then infected quite well. The virus needed a large swarm in order to do its job. Domingo and Vignuzzi pointed out that the popular antiviral medication ribavirin pushes viruses to develop a swarm that is so big and so full of mutations that the resulting viral potluck is missing vital components.

Viral variants also let viruses evolve and spread to new species. A rabies outbreak in gray foxes in Humboldt County, north of San Francisco, was traced back to a skunk virus that had jumped to foxes.

To see when this jump may have occurred, Monica Borucki, a virologist at the Lawrence Livermore National Laboratory in California, used advanced next-generation genetic sequencing to examine the viral swarms of rabies-infected animals, going back to samples collected in earlier years. This type of deep sequencing lets researchers search for minor variants in a virus that can acquire mutations and ultimately take over.

And indeed, Borucki found genetic traces of the outbreak virus even in the earliest samples. The results, published in PLOS Neglected Tropical Diseases, showed that rare viral variants in an individual provide crucial reservoirs of genetic diversity that can help a virus jump species and evolve. They also offered some of the first clues that could help scientists begin to predict what might happen in the future, if they could only deconstruct the viral swarm.

Chikungunya is frequently found on the east coast of Africa, where it is transmitted by Aedes aegypti mosquitoes, but Reunion had very few of those mosquitoes.

Instead, the island had a closely related species known as the Asian tiger mosquito Aedes albopictus. Researchers eventually realized that a single mutation in one of the proteins that coat the virus allowed chik to pick the lock of the Asian tiger mosquito cells and enter much more easily. When researchers compared the original strain, from Kenya, to the Reunion one, they found that the Reunion strain was 40 to times better suited to the Asian tiger mosquito—an amazing jump for just one mutation.

Subsequent work revealed that similar mutations had happened at least three more times as the chik outbreak spread throughout countries along the Indian Ocean. Vignuzzi took the original chikungunya strain circulating in Kenya and infected a group of Asian tiger mosquitoes with it.

Like polio, chik moves from one place to another in the body. On the seventh day of this process, Vignuzzi and his postdoc Kenneth Stapleford dissected the mosquitoes, extracting virus from the midgut, salivary glands and saliva, and sequencing the viruses found in each sample. In the midgut, they found many random mutations, but no single mutation appeared in more than one mosquito. The saliva, however, told a different story.

The saliva in three of four mosquitoes contained the Reunion mutant. In one of these, the Reunion mutant made up 99 percent of the total virus population. A continuing outbreak meant ongoing opportunities to evolve further. The first mutation had popped up so quickly in the lab that Vignuzzi and Stapleford began to wonder if they could predict further changes to the virus. So they repeated their experiments, but this time they started by infecting the mosquitoes with the Reunion strain.

They let the virus percolate in the mosquitoes for 10 days, to give it more time to acquire new mutations. Again they sequenced the viruses they found in the various mosquito tissues, and they identified two new mutants, both with mutations in the same lock-picking coat protein as the original Reunion mutant, results they published last year in Cell Host and Microbe. Ongoing work, which Vignuzzi presented at the microbiology meeting in June, has involved tracking how these mutations were selected in mice, a stand-in for chik-infected humans.

The swarm concept is forcing some scientists to rethink some of the basic tenets of population genetics. Typically, the fitness of a virus is measured by how many copies of itself it can make compared to another virus, according to Andino.

Virologists like Andino and Domingo argue that the evolutionary fitness of a virus should include its ability to mutate. Since individual virus variants can cooperate and interact, readily swapping proteins in their final product, the smallest unit evolution can select for is the swarm itself.

Only by considering the whole mutant cloud of viruses can scientists hope to understand how they behave and what they might do in the future. No one was surprised when chik reached the Americas.

Collectively, we provide evidence that blood transcriptomes from COVID-19 patients represent a promising feature space for applying Swarm Learning (SL). With increasing efforts to enforce data privacy and security [5, 9, 10] and to reduce data traffic and duplication, a decentralized data model will become the preferred choice for handling, storing, managing, and analysing any kind of large medical dataset. Particularly in oncology, success has been reported in machine-learning-based tumour detection [3, 37], subtyping [38], and outcome prediction [39], but progress is hindered by the limited size of datasets [19], with current privacy regulations [5, 9, 10] making it less appealing to develop centralized AI systems.

SL, as a decentralized learning system, replaces the current paradigm of centralized data sharing in cross-institutional medical research. SL provides confidentiality-preserving machine learning by design and can inherit new developments in differential privacy algorithms [40], functional encryption [41], or encrypted transfer learning approaches [42] (Supplementary Information). Global collaboration and data sharing are important quests [13] and both are inherent characteristics of SL, with the further advantage that data sharing is not even required and can be transformed into knowledge sharing, thereby enabling global collaboration with complete data confidentiality, particularly if using medical data.

Indeed, statements by lawmakers have emphasized that privacy rules apply fully during a pandemic. Particularly in such crises, AI systems need to comply with ethical principles and respect human rights. Systems such as SL, which allow fair, transparent, and highly regulated shared data analytics while preserving data privacy, are to be favoured. SL should be explored for image-based diagnosis of COVID-19 from patterns in X-ray images or CT scans [15, 16], structured health records [12], or data from wearables for disease tracking. Collectively, SL and transcriptomics or other medical data are a very promising approach to democratize the use of AI among the many stakeholders in the domain of medicine, while at the same time resulting in improved data confidentiality, privacy, and data protection, and a decrease in data traffic.

For normalization, we considered all platforms independently, meaning that normalization was performed separately for the samples in datasets A1, A2 and A3. Microarray data (datasets A1 and A2) were normalized using the robust multichip average (RMA) expression measures, as implemented in the R package affy. To keep the datasets comparable, data were filtered for genes annotated in all three datasets, which resulted in 12, genes.

No filtering of low-expressed genes was performed. All scripts used in this study for pre-processing are provided as a Docker container on Docker Hub.

As alignment of whole blood transcriptome data can be performed in many ways, we re-aligned all downloaded and collected datasets (Supplementary Information). Processing was done separately for datasets B, D, and E.

As some of the samples were prepared with poly-A selection to enrich for protein-coding mRNAs, we filtered the complete dataset for protein-coding genes to ensure greater comparability across library preparation protocols.

Furthermore, we excluded all ribosomal protein-coding genes, as well as mitochondrial genes and genes coding for haemoglobins, which resulted in 18, transcripts as the feature space in dataset B, 19, in dataset D and 19, in dataset E.

Other than that, no filtering of transcripts was performed. Before using the data in machine learning, we performed a rank transformation to normality on datasets B, D and E. In brief, transcript expression values were transformed from RNA-seq counts to their ranks. This was done transcript-wise, meaning that, for each transcript, the expression values of all samples were ranked from lowest to highest.

The rankings were then turned into quantiles and transformed using the inverse cumulative distribution function of the normal distribution. This leads to all transcripts following the exact same distribution (that is, a standard normal with a mean of 0 and a standard deviation of 1) across all samples.
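The rank transformation described above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the pipeline used in the study: the function name rank_to_normal, the averaging of tied ranks, and the (rank - 0.5) / n rule for turning ranks into quantiles are all assumptions made for illustration, because the text does not specify them.

```python
from statistics import NormalDist


def rank_to_normal(values):
    """Map one transcript's values across samples to standard-normal scores.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    # Sort sample indices by expression value, lowest first.
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Assumption: tied values share the average of their 1-based ranks.
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Assumption: (rank - 0.5) / n keeps every quantile strictly inside (0, 1),
    # so the inverse CDF is always defined.
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


# One transcript measured in six samples (made-up counts).
counts = [12, 250, 3, 47, 47, 1800]
scores = rank_to_normal(counts)
```

Because only the ordering of the counts matters, the very large value 1800 ends up no further from the centre than the smallest value 3, which is what makes transcripts with very different count ranges comparable.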

To preprocess the data, we used Keras.

SL builds on two proven technologies, distributed machine learning and blockchain (Supplementary Information). The Swarm Learning library (SLL) is a framework to enable decentralized training of machine learning models without sharing the data.

It is designed to make it possible for a set of nodes, each possessing some training data locally, to train a common machine learning model collaboratively without sharing the training data. This can be achieved by individual nodes sharing parameters (weights) derived from training the model on the local data. This allows local measures at the nodes to maintain the confidentiality and privacy of the raw data.

Notably, in contrast to many existing federated learning models, a central parameter server is omitted in SL. Detailed descriptions of the SLL, the architecture principles, the SL process, implementation, and the environment can be found in the Supplementary Information. The Swarm network is created with a minimum of 3 up to a maximum of 32 training nodes, and each node is a docker container with access to GPU resources. Multiple experiments were run in parallel using this configuration.

We performed 5— permutations per scenario and each permutation took approximately 30 min, which resulted in a total of 8, computer hours. We leveraged a deep neural network with a sequential architecture as implemented in Keras v 2.

Keras is an open source software library that provides a Python interface to neural networks. The Keras API was developed with a focus on fast experimentation and is standard for deep learning researchers. The model, which was already available in Keras for R from the previous study [3], has been translated from R to Python to make it compatible with the SLL (Supplementary Information).

In brief, the neural network consists of one input layer, eight hidden layers and one output layer. From the first to the eighth hidden layer, nodes are reduced from 1, to 64 nodes, and all layers contain a rectified linear unit activation function, a kernel regularization with an L2 regularization factor of 0. The output layer is densely connected and consists of one node and a sigmoid activation function.

The model is configured for training with Adam optimization and to compute the binary cross-entropy loss between true labels and predicted labels.
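To make the loss concrete, the following Python sketch computes a sigmoid output and the binary cross-entropy between true labels and predicted probabilities. It is an illustration of the standard formulas, not the Keras implementation used in the study; the function names and the clipping constant eps are assumptions.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over all samples.

    eps is an assumed clipping constant that keeps log() finite
    when a predicted probability is exactly 0 or 1.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
# Raw outputs of the final node (made-up values), passed through the sigmoid.
probs = [sigmoid(z) for z in [2.0, -1.5, 0.3, -0.2]]
loss = binary_cross_entropy(labels, probs)
```

The loss is small when confident predictions agree with the labels and grows quickly when a confident prediction is wrong, which is the behaviour the optimizer exploits during training.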

The model is used for training both the individual nodes and SL. The model is trained over epochs, with varying batch sizes. Batch sizes of 8, 16, 32, 64 and are used, depending on the number of training samples. SL is not restricted to any particular classification algorithm.

We therefore adapted the l1-penalized logistic regression [3] to be used with the SLL in the form of a Keras single dense layer with linear activation. The regularization parameter lambda was set to 0. For most scenarios, default settings were used without parameter tuning. For some of the scenarios we tuned model hyperparameters, and for some we also tuned SL parameters to get better performance (for example, higher sensitivity; Supplementary Table 8).

Supplementary Table 8 provides a complete overview of all tuning parameters used. Different functions are available for parameter merging as a configuration of the Swarm API, which are then applied by the leader at every synchronization interval. The parameters can be merged as average, weighted average, minimum, maximum, or median functions.
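The merge functions listed above can be sketched in a few lines of Python. This is not the Swarm API: merge_parameters, its method names, and the use of per-node sample counts as weights for the weighted average are hypothetical choices made only to show what each merge does to the parameters collected from the nodes.

```python
from statistics import mean, median


def merge_parameters(node_weights, method="average", sample_counts=None):
    """Merge per-node weight vectors element by element.

    node_weights: one list of floats per node, all of equal length.
    sample_counts: assumed weights for "weighted_average" (one per node).
    Hypothetical helper, not part of any real library.
    """
    # Group the i-th parameter of every node together.
    columns = list(zip(*node_weights))
    if method == "average":
        return [mean(col) for col in columns]
    if method == "weighted_average":
        if sample_counts is None or len(sample_counts) != len(node_weights):
            raise ValueError("weighted_average needs one sample count per node")
        total = sum(sample_counts)
        return [
            sum(w * n for w, n in zip(col, sample_counts)) / total
            for col in columns
        ]
    if method == "minimum":
        return [min(col) for col in columns]
    if method == "maximum":
        return [max(col) for col in columns]
    if method == "median":
        return [median(col) for col in columns]
    raise ValueError(f"unknown merge method: {method}")


# Three nodes, each holding two locally trained parameters (made-up values).
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
merged = merge_parameters(weights)
```

In this sketch the merged vector would be sent back to every node before the next round of local training, so that all nodes continue from the same parameters.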

Unless stated otherwise, we used a simple average without weights to merge the parameter for neural networks and for the LASSO algorithm. We evaluated binary classification model performance with sensitivity, specificity, accuracy, F1 score, and AUC metrics, which were determined for every test run.
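For reference, the metrics named above can be computed from a test run as in the following Python sketch. It uses the standard textbook definitions and is not the scikit-learn code used in the study; the function names and the convention of returning 0.0 for an undefined ratio are assumptions.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and F1 from binary labels (1 = case)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Assumption: report 0.0 when a ratio is undefined (zero denominator).
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(y_true)),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true, y_score):
    """AUC as the share of (case, control) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up test run with eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
metrics = classification_metrics(y_true, y_pred)
area = auc(y_true, y_score)
```

The pairwise form of the AUC is quadratic in the number of samples, which is acceptable for a small illustration but slower than the rank-based computation a library would use.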

For TB, the performance metrics were collected by running 10 to 50 permutations. For the X-ray images, 10 permutations were performed.

For COVID-19, the performance metrics were collected by running 10 to 20 permutations for each scenario. All metrics are listed in Supplementary Tables 3 and 4. Differences in performance metrics were tested using the one-sided Wilcoxon signed rank test with continuity correction.

All test results are provided in Supplementary Table 5. To run the experiments, we used Python version 3. We used scikit-learn library version 0. Summary statistics and hypothesis tests were calculated using R version 3. Each metric was calculated according to its standard definition.

No statistical methods were used to predetermine sample size. The experiments were not randomized, but permutations were performed.

Investigators were not blinded to allocation during experiments and outcome assessment. Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Furthermore, it contains the data from the Rhineland Study. Samples received from other public repositories are listed in Supplementary Table 2.

1. Aronson, S. Building the foundation for genomics in precision medicine. Nature.
2. Haendel, M. Classification, ontology, and precision medicine.
3. Warnat-Herresthal, S. Scalable prediction of acute myeloid leukemia using high-dimensional machine learning and blood transcriptomics.
4. Wiens, J. Do no harm: a roadmap for responsible machine learning for health care.
5. Price, W. Privacy in the age of medical big data.
6. Berlin, D. Severe Covid-19.
7. Gandhi, R. Mild or moderate Covid-19.
8. He, J. The practical implementation of artificial intelligence technologies in medicine.
9. Kels, C. HIPAA in the era of data sharing.
10. McCall, B. What does the GDPR mean for the medical community? Lancet.
