In this dissertation, we explore the impact of geometry and topology on the capabilities of deep learning models. Learning requires fitting a model to the training data while generalizing as accurately as possible to unseen data. Naturally, properties of the underlying data distribution affect the ability of models to learn the risk-minimizing function. In addition, the sample size and number of parameters required are quantities that can be reduced by leveraging geometric and topological information.
In Chapter 2, we show that under certain assumptions on the network activation function, sets of networks with a fixed architecture are not closed in Sobolev spaces. This means that a function and its derivatives can often be approximated to arbitrary accuracy by networks of a given architecture, even when the function itself cannot be realized exactly by such a network. However, doing so forces the network parameters to blow up, which provides further insight into the approximation capabilities of neural networks.
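As a simple illustration of this phenomenon (the example and notation here are illustrative rather than taken from Chapter 2), for many common activations $\sigma$ the derivative $\sigma'$ is not exactly expressible by a small fixed-size network, yet it arises as a limit of difference quotients,
\[
\frac{1}{h}\,\sigma(x+h) \;-\; \frac{1}{h}\,\sigma(x) \;\longrightarrow\; \sigma'(x) \qquad \text{as } h \to 0,
\]
where each approximant is itself a two-neuron network whose outer weights $1/h$ diverge as the target is approached. This is the pattern behind non-closedness: the limiting function can be approximated to arbitrary accuracy, but only at the cost of exploding parameters.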
Chapter 3 analyzes the generalization benefits of data augmentation. Though data augmentation is widely believed to improve generalization, we establish novel, provably tighter bounds on generalization error under algorithmic and distributional assumptions. In particular, algorithms satisfying strong stability criteria enjoy improved generalization under data augmentation. Moreover, invariance properties of the data distribution can ensure that we learn the risk-minimizing function with better generalization.
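In schematic form (the notation is illustrative, not necessarily that of Chapter 3), data augmentation replaces the empirical risk by an average over a distribution $Q$ of transformations $g$ acting on the inputs,
\[
\widehat{R}_{\mathrm{aug}}(f) \;=\; \frac{1}{n}\sum_{i=1}^{n} \mathbb{E}_{g \sim Q}\Bigl[\ell\bigl(f(g \cdot x_i),\, y_i\bigr)\Bigr].
\]
When the data distribution is invariant under these transformations, averaging over transformed copies of each sample preserves unbiasedness of the empirical risk while it can only reduce its variance, which is one mechanism by which augmentation yields tighter generalization guarantees.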
We turn to semi-supervised learning in Chapter 4, where we demonstrate how autoencoders applied to separate charts of a manifold can reduce the model complexity required for reconstruction. Under assumptions on the geometry and topology of the manifold, we characterize how many charts are needed for encoding via linear projections. We also show that this approach has only a mild impact on decoder complexity, which depends only weakly on the ambient data dimension.
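Schematically (again with illustrative notation), a chart-based autoencoder covers the data manifold $\mathcal{M} \subset \mathbb{R}^{D}$ by charts $U_1, \ldots, U_k$ and, on each chart, encodes via a linear projection $P_j \colon \mathbb{R}^{D} \to \mathbb{R}^{d}$ paired with a nonlinear decoder $\mathcal{D}_j \colon \mathbb{R}^{d} \to \mathbb{R}^{D}$, so that
\[
x \;\approx\; \mathcal{D}_{j(x)}\bigl(P_{j(x)}\, x\bigr), \qquad x \in U_{j(x)} \subset \mathcal{M},
\]
where $j(x)$ selects the chart containing $x$ and $d$ is the intrinsic dimension. The geometric and topological assumptions govern how many charts $k$ are needed for such linear encoders to faithfully represent points on each chart.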
Finally, Chapter 5 studies point cloud classification via a linear optimal transport embedding. We provide sufficient conditions under which distributions can be nearly isometrically embedded into Euclidean space via input-convex neural networks trained to approximate optimal transport maps. We can then linearly separate classes based on point clouds sampled from the target distributions. Once again, we leverage the underlying geometry of the data to improve model capabilities.
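In outline (notation illustrative), the linear optimal transport embedding fixes a reference distribution $\rho$ and represents each distribution $\mu_i$ by the optimal transport map pushing $\rho$ onto $\mu_i$, written as the gradient of a convex potential and approximated in practice by an input-convex network,
\[
\mu_i \;\longmapsto\; T_i = \nabla \varphi_i, \qquad (T_i)_{\#}\rho = \mu_i,
\]
so that, under suitable conditions, $W_2(\mu_i, \mu_j) \approx \|T_i - T_j\|_{L^2(\rho)}$ and the embedded maps live in a Euclidean (Hilbert) space where classes of point clouds can be separated by a hyperplane.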