
    • M3: The Three-Mathematical Minds Model for the Identification of Mathematically Gifted Students

      Maker, June; Sak, Ugur (The University of Arizona., 2005)
      Views of giftedness have evolved from unilateral notions to multilateral conceptions. The primary purpose of this study was to investigate the psychological validity of the three-mathematical minds model (M3) developed by the author. The M3 is based on multilateral conceptions of giftedness and is intended to identify mathematically gifted students. The teachings of Poincaré and Pólya on mathematical ability, as well as the theory of successful intelligence proposed by Sternberg (1997), provided the initial framework for the development of the M3. A secondary purpose was to examine the psychological validity of the three-level cognitive complexity model (C3) developed by the author. The C3 draws on studies of expertise to differentiate among gifted, above-average, and average-to-below-average students at three levels. The author developed a test of mathematical ability based on the M3 and C3 in collaboration with mathematicians. The test was administered to 291 middle school students from four different schools. Reliability analysis indicated that the M3 had a coefficient of .72 for the consistency of scores. Exploratory factor analysis yielded three separate components explaining 55% of the total variance. The convergent validity analysis showed that the M3 had medium to high-medium correlations with teachers' ratings of students' mathematical ability (r = .45), students' ratings of their own ability (r = .36), and students' liking of mathematics (r = .35). Item-subtest-total score correlations ranged from low to high. Some M3 items were found to be homogeneous, measuring only one aspect of mathematical ability, such as creative mathematical ability, whereas other items were found to be good measures of more than one facet of mathematical ability. The C3 accounted for 41% of the variance in item difficulty (R² = .408, p < .001). Item difficulty ranged from .02 to .93 with a mean of .29. The analysis of the discrimination power of the three levels of the C3 revealed that level-two and level-three problems differentiated significantly among the three ability levels, but level-one problems did not differentiate between gifted and above-average students. The findings provide partial evidence for the psychological validity of both the M3 and C3 for the identification of mathematically gifted students.
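The item statistics reported above (item difficulty ranging from .02 to .93, item-total correlations from low to high) come from classical test theory. As a hypothetical illustration, not the author's instrument, here is a minimal sketch of the two quantities on an invented response matrix:

```python
# Hypothetical sketch of two classical test-theory quantities used in the
# abstract: item difficulty (proportion correct) and item-total
# discrimination (point-biserial correlation). The tiny response matrix
# below is invented for illustration.
from math import sqrt

# rows = examinees, columns = items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

def item_difficulty(col):
    """Proportion of examinees answering the item correctly."""
    return sum(row[col] for row in responses) / len(responses)

def point_biserial(col):
    """Correlation between an item score and the total test score."""
    totals = [sum(row) for row in responses]
    item = [row[col] for row in responses]
    n = len(item)
    mi, mt = sum(item) / n, sum(totals) / n
    cov = sum((i - mi) * (t - mt) for i, t in zip(item, totals)) / n
    si = sqrt(sum((i - mi) ** 2 for i in item) / n)
    st = sqrt(sum((t - mt) ** 2 for t in totals) / n)
    return cov / (si * st)

difficulties = [item_difficulty(c) for c in range(4)]
print(difficulties)   # easiest item first: [0.8, 0.6, 0.4, 0.2]
print(round(point_biserial(0), 2))
```

Items with point-biserial values near zero fail to discriminate between ability levels, which is the kind of analysis used above to compare the three C3 levels.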
    • Machine Learning and Additive Manufacturing Based Antenna Design Techniques

      Xin, Hao; Sharma, Yashika; Dvorak, Steven L.; Roveda, Janet Meiling; Zhang, Hao Helen (The University of Arizona., 2020)
      This dissertation investigates the application of machine learning (ML) techniques to additive manufacturing (AM) technology, with the ultimate goal of tackling universal antenna design challenges and achieving automated antenna design for a broad range of applications. First, we investigate the implementation and accuracy of a few modern machine learning techniques, including the least absolute shrinkage and selection operator (lasso), artificial neural networks (ANN), and k-nearest neighbor (kNN) methods, for antenna design optimization. The automated techniques provide an efficient, flexible, and reliable framework to identify optimal design parameters for a reference dual-band double T-shaped monopole antenna to achieve favorable performance in terms of its dual bandwidth. We first provide a brief background for these techniques and then explain how they can be used to optimize the performance of the reference antenna. The accuracy of these techniques is then tested through a comparative analysis against HFSS simulations. After obtaining encouraging results from this preliminary work, we implement ML techniques for the optimization of a more complex 3D-printed slotted waveguide antenna. The design has more design parameters to be tuned, and multiple performance parameters, including bandwidth, realized gain, sidelobe level, and back lobe level, are optimized. This is a higher-dimensional and non-linear problem; hence, we use an artificial neural network for this work. Next, we demonstrate the advantages and challenges of using ML techniques compared to heuristic optimization techniques. We apply ML techniques first for ‘modeling’, which refers to predicting the performance curve (e.g., reflection coefficient w.r.t. frequency, gain plots in a given plane, etc.) for a given antenna design with a particular set of design parameters, and then use the model for ‘optimization’, which refers to searching for the design parameter values that give optimized results for a particular goal (e.g., a specific frequency band of operation, maximum gain, minimum sidelobe level, etc.). To illustrate modeling using ML techniques, we use two antenna examples: first, modeling the reflection coefficient curve with respect to frequency for a planar patch antenna as its shape changes from square to circular; second, modeling the gain response of a monopole antenna loaded with 3D-printed dielectric material. To illustrate the optimization process, we use the behavioral model obtained in the second antenna example and find the design parameter values capable of providing single-beam and multiple-beam radiation. The performance of ML is compared with a heuristic technique, the genetic algorithm (GA), and the benefits of using ML over GA are discussed. One of the prototypes, which provides a 3-beam radiation pattern, is manufactured, and its fabrication process and measurement results are also presented. With this work, ML models are built to find the relationship between design parameters and antenna performance parameters analytically, thus requiring only analytical calculations instead of time-consuming numerical simulations for different design goals. This is useful for applications such as IoT, which involve a large number of antenna designs with different goals and constraints. ML techniques help build such behavioral models for antennas automatically from data, which is beneficial for fully exploring the vast design degrees of freedom offered by AM.
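As a toy illustration of the surrogate-modeling workflow described above (not the dissertation's HFSS-driven pipeline), the following sketch fits a k-nearest-neighbor regressor to a handful of "simulated" bandwidth values and then searches the cheap surrogate for a promising design. The antenna parameter, the toy simulator, and all numbers are invented:

```python
# Hypothetical sketch: a kNN regressor is fit on a few (design parameter,
# simulated bandwidth) pairs, then queried densely to pick a promising
# design without further full-wave simulations. The "simulator" here is a
# toy quadratic standing in for an expensive solver such as HFSS.
def simulate_bandwidth(length_mm):
    """Toy stand-in for a full-wave solver: peak bandwidth near 12 mm."""
    return 500.0 - 8.0 * (length_mm - 12.0) ** 2

# Sparse "simulated" training data (expensive in reality, cheap here).
train_x = [6.0, 8.0, 10.0, 14.0, 16.0, 18.0]
train_y = [simulate_bandwidth(x) for x in train_x]

def knn_predict(x, k=2):
    """Average the k nearest training responses."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

# Query the cheap surrogate on a dense grid and keep the best design.
grid = [6.0 + 0.5 * i for i in range(25)]   # 6.0 .. 18.0 mm
best = max(grid, key=knn_predict)
print(best)
```

In the real workflow, the surrogate's suggestion would be verified with one final full-wave simulation, which is where the efficiency gain comes from.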
    • Machine Learning and Deep Phenotyping Towards Predictive Analytics and Therapeutic Strategy in Cardiac Surgery

      Konhilas, John P.; Skaria, Rinku; Runyan, Raymond B.; Antin, Parker B.; Langlais, Paul R.; Churko, Jared (The University of Arizona., 2020)
      Introduction: Myocardial infarction (MI) secondary to coronary artery disease (CAD) remains the most common cause of heart failure (HF), accounting for over $30 billion in healthcare costs. Although early revascularization is the most effective therapy to restore blood flow and salvage myocardium, to date there are no available treatments to attenuate ischemia-reperfusion injury (IRI). Moreover, post-operative atrial fibrillation (POAF) continues to be a devastating complication following cardiac surgery, affecting 25-40% of CABG and 30-40% of valve patients. Human placental amniotic (HPA) tissue is known to have anti-inflammatory and wound healing properties and therefore may promote anti-arrhythmic and cardioprotective effects in patients undergoing cardiac surgery. The central hypothesis of this study is that the use of predictive modeling in conjunction with HPA application improves cardioprotection against IRI and POAF following cardiac surgery. Methods: We developed predictive models for POAF using machine learning to characterize 340,860 isolated CABG patients from 2014 to 2017 in the national Society of Thoracic Surgeons database. The support-vector machine (SVM) models were assessed based on accuracy, sensitivity, and specificity, and the neural network (NN) model was compared to the currently utilized CHA2DS2-VASc score. Additionally, using a clinically relevant model of IRI, we performed an unbiased, non-hypothesis-driven transcriptome and proteome analysis to elucidate cellular and molecular mechanisms of HPA xenograft-induced cardioprotection against IRI. Swine (n=3 in the MI-only and MI+HPA groups) were subjected to a 45-minute percutaneous IRI protocol followed by HPA placement in the treated group. Cardiac function was assessed, and tissue samples were collected on post-operative day 14. Results were further supported by histology, RT-PCR, and Western blot analyses. Lastly, a retrospective study of 78 isolated CABG and 47 isolated valve patients was conducted to determine whether HPA use on the epicardial surface decreases the incidence of POAF. Results: Predictive modeling using neural networks was shown to outperform the CHA2DS2-VASc score in predicting POAF in CABG patients. Second, we present the first comprehensive transcriptome and proteome profiles of the ischemic, border, and remote myocardium during the proliferative cardiac repair phase with HPA xenograft use in swine. Our results establish that HPA limited the extent of cardiac injury by 50% and preserved cardiac function. Spatial dynamic responses, as well as coordinated immune and extracellular matrix remodeling to mitigate injury, were among the key findings. Changes in protein secretion, mitochondrial bioenergetics, and inflammatory responses were also noted to contribute to cardioprotection. Third, peri-operative HPA allograft placement demonstrated a strong reduction in the incidence of POAF following CABG and valve surgery. Discussion: We provide convincing evidence that HPA has beneficial effects on injured myocardium and POAF and can serve as a new therapeutic strategy in cardiac patients. Additionally, we demonstrate that predictive modeling using machine learning holds promise for predicting, and ultimately reducing, the incidence of POAF in cardiac surgery patients.
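The three assessment criteria named in the Methods above (accuracy, sensitivity, specificity) can be sketched from a confusion matrix. The label vectors below are invented and purely illustrative:

```python
# Hypothetical sketch of the three metrics used to assess the SVM models:
# accuracy, sensitivity, and specificity. The label vectors are invented
# (1 = developed POAF, 0 = did not).
actual    = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

accuracy    = (tp + tn) / len(actual)   # fraction of correct calls
sensitivity = tp / (tp + fn)            # true-positive rate (recall)
specificity = tn / (tn + fp)            # true-negative rate

print(accuracy, sensitivity, specificity)
```

Sensitivity and specificity matter more than raw accuracy here because POAF cases are a minority class, so a trivial "no POAF" predictor would already look accurate.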
    • Machine Learning Enhanced Quality of Transmission Estimation in Disaggregated Optical Systems

      Kilper, Daniel C.; Zhu, Shengxiang; Djordjevic, Ivan; Lazos, Loukas (The University of Arizona., 2020)
      Telecommunication systems have been through continuous evolution to keep up with fast-growing network traffic demand. With developments such as HD video streaming, cloud computing, the Internet of Things (IoT), hyper-scale data centers, and 5G wireless networks, more demanding network requirements raise the challenge of creating more efficient optical communication systems that can support a wide range of applications. Specifically, 5G standards require more bandwidth and ultra-low latency, and metro-scale optical aggregation networks motivate more scalable and on-demand optical network capacity. Dynamic reconfigurable optical add-drop multiplexer (ROADM) based wavelength-division multiplexing (WDM) systems, in which connections are established through real-time wavelength switching operations, have long been studied as a means of achieving greater scalability and increasing network resource utilization. A new dimension, referred to as disaggregated optical systems, has the potential to further drive down cost by commoditizing the hardware. While ROADMs are extensively deployed in today’s WDM systems, their interoperability and functionality remain limited. Recent advances in hardware and software, such as optical physical layer software-defined networking (SDN), significantly improve the multi-layer control and management potential of ROADM systems, even facilitating wavelength switching. However, ensuring stable performance and reliable quality of transmission (QoT) remains a severe problem, particularly for disaggregated systems. A key challenge in enabling disaggregated optical systems is the uncertainty and optical power dynamics that arise from a variety of physical effects in the amplifiers and transmission fiber. This thesis examines the potential of machine learning for QoT estimation in software-defined networking control and its application to disaggregated ROADM systems. Current physical layer control of flexible meshed optical networks with dynamic reconfigurability is reviewed, and future network control plane architectures based on disaggregated optical systems are discussed. To enable high capacity and low latency in inter-domain routing, a transparent software-defined exchange (tSDX) is proposed and implemented. To serve a broadening range of applications and increase network efficiency, a novel transmission system based on hysteresis-controlled adaptive coding is studied, which can adapt to diverse and changing transmission conditions, including optical signal-to-noise ratio (OSNR) variations. To resolve optical channel power excursions caused by wavelength operations in optically amplified networks, a dual laser switching technique is proposed and experimentally verified, which is able to cancel out the excursion. To build an accurate numerical model of an optical amplifier, a critical component in the calculation of QoT, a novel machine learning (ML) model based on deep neural networks (DNN) and supervised learning is studied. Experimental results demonstrate the superiority of ML-based modeling in prediction accuracy of the optical channel power and gain spectrum of Erbium-Doped Fiber Amplifiers (EDFAs). A hybrid machine learning (HML) model, which combines a-priori knowledge (an empirical numerical model) and a-posteriori knowledge (a supervised machine learning model), is proposed and realized, and is shown to reduce training complexity, in both time and space, compared to a purely analytical or neural network-based model. Finally, a potential improvement to the current QoT estimation framework, based on this enhanced EDFA model, is proposed and analyzed.
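The a-priori/a-posteriori split behind the hybrid model can be illustrated with a toy sketch: a crude empirical gain model is corrected by a residual learned from "measurements", so the learner only has to fit the small discrepancy. The EDFA numbers and the flat-gain empirical model below are invented stand-ins:

```python
# Hypothetical sketch of the hybrid-model idea: an a-priori empirical gain
# model plus an a-posteriori correction fit to data. Here the correction is
# a least-squares line on the residuals; all numbers are toy stand-ins for
# a real EDFA characterization.
input_power_dbm = [-30.0, -25.0, -20.0, -15.0, -10.0, -5.0]

def empirical_gain(p_in):
    """Crude a-priori model: flat 20 dB gain, independent of input power."""
    return 20.0

# Toy "measured" gains showing mild compression the flat model misses.
measured = [20.9, 20.6, 20.3, 19.9, 19.4, 18.9]

# Fit the residual (measured - empirical) with a least-squares line r = a*p + b.
residual = [m - empirical_gain(p) for p, m in zip(input_power_dbm, measured)]
n = len(residual)
mx = sum(input_power_dbm) / n
my = sum(residual) / n
a = sum((x - mx) * (y - my) for x, y in zip(input_power_dbm, residual)) \
    / sum((x - mx) ** 2 for x in input_power_dbm)
b = my - a * mx

def hybrid_gain(p_in):
    """Empirical model plus the learned residual correction."""
    return empirical_gain(p_in) + a * p_in + b

print(round(hybrid_gain(-12.5), 2))
```

Because the residual is small and smooth, far less training data is needed than for learning the full gain function from scratch, which is the training-complexity reduction the abstract refers to.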
    • Machine Learning for Channel Estimation and Hybrid Beamforming in Millimeter-Wave Wireless Networks

      Bose, Tamal; Tandon, Ravi; Peken, Ture; Ditzler, Gregory; Djordjevic, Ivan (The University of Arizona., 2021)
      The continuous growth of mobile users and high-speed wireless applications drives the demand for using the abundant bandwidth of mmWave (millimeter-wave) frequencies. On the one hand, a massive number of antennas can be supported due to the small wavelengths of mmWave signals, which allow antennas with small form factors. On the other hand, free-space path loss increases with the square of the frequency, which implies that path loss is severe at mmWave frequencies. Fortunately, one can compensate for the performance degradation due to path loss by using directional beamforming (BF) along with high-gain large antenna array systems (massive MIMO). This dissertation tackles three distinct problems: channel estimation in massive MIMO, signal detection in massive MIMO, and the efficient design of hybrid BF algorithms. In the first part of this dissertation, we focus on effective channel estimation for massive MIMO systems to overcome the pilot contamination problem. We present an adaptive independent component analysis (ICA)-based channel estimation method, which outperforms conventional ICA as well as other conventional methods for channel estimation. We also make use of compressive sensing (CS) methods for channel estimation and show their advantages in terms of channel estimation accuracy and complexity. In the second part of this dissertation, we consider the problem of signal detection, focusing specifically on scenarios where non-Gaussian signals need to be detected and the receiver may be equipped with a large number of antennas. We show that, for non-Gaussian signal detection, the conventional Neyman-Pearson (NP) detector does not perform well in the low signal-to-noise-ratio (SNR) regime. Motivated by this, we propose a bispectrum detector, which is better able to detect the non-Gaussian information in the signal. We also present a theoretical analysis of the asymptotic behavior of the probability of false alarm and the probability of detection. We show the performance of signal detection (for both Gaussian and non-Gaussian signals) as a function of the number of antennas and the sampling rate, and we obtain the scaling behavior of the performance in the massive antenna regime. The third part of this dissertation covers the efficient design of hybrid BF algorithms, with a specific focus on massive MIMO systems in mmWave networks. The key challenge in the design of hybrid BF algorithms in such networks is that the computational complexity can be prohibitive. We start from the fundamental approach of finding BF solutions through singular value decomposition (SVD) and explore the role of ML techniques in performing the SVD. The first part of this contribution focuses on a data-driven approach to the SVD. We propose three deep neural network (DNN) architectures to approximate the SVD, with varying levels of complexity. The methodology for training these DNN architectures is inspired by a fundamental property of the SVD: it can be used to obtain low-rank approximations. We next explicitly take the constraints of hybrid BF into account (such as quantized phase shifters and power constraints) and propose a novel DNN-based approach for the design of hybrid BF systems. Our results show that DNNs can be an attractive and efficient solution for estimating both the SVD and hybrid beamformers. Furthermore, we provide time complexity and memory requirement analyses for the proposed DNN-based and state-of-the-art hybrid BF approaches. We then propose a novel reinforcement learning-based hybrid BF algorithm that applies Q-learning in a supervised manner. We analyze the computational complexity of our algorithm as a function of iteration steps and show that a significant reduction in computational complexity is achieved compared to exhaustive search. In addition to these supervised approaches, in the remaining part of this contribution we explore unsupervised methods for the SVD and hybrid BF. These methods are particularly attractive for scenarios where channel conditions change quickly and a pre-existing dataset of channels and corresponding optimal BF solutions, as required for supervised learning, may not be available. For unsupervised learning, we explore two techniques, autoencoders and generative adversarial networks (GANs), for both the SVD and hybrid BF. We first propose a linear autoencoder-based approach for the SVD and then provide a linear autoencoder-based hybrid BF algorithm that incorporates the constraints of hybrid BF. In the last part of this contribution, we focus on two generative models, variational autoencoders (VAEs) and GANs, to reduce the number of training iterations compared to the linear autoencoder-based approach. We first propose VAE and Wasserstein GAN (WGAN) based algorithms for the SVD, and we then present a VAE and a novel GAN architecture to find hybrid BF solutions.
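The low-rank property that motivates the linear autoencoder approach can be sketched with plain numpy: projecting onto the top right-singular vectors and mapping back reproduces the truncated SVD, which is what a linear autoencoder with a narrow bottleneck converges to. The "channel matrix" below is random toy data and the two-dimensional bottleneck is an arbitrary choice:

```python
# Hypothetical sketch of the Eckart-Young property exploited by the
# SVD/autoencoder training described in the abstract. The channel matrix
# is random toy data, not a real mmWave channel model.
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((8, 6))          # toy "channel matrix"

U, s, Vt = np.linalg.svd(H, full_matrices=False)

def rank_k_approx(k):
    """Truncate the SVD to its k largest singular values."""
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# A rank-2 "beamformer-like" compression: project onto the top-2 right
# singular vectors (encoder), then map back (decoder).
encoder = Vt[:2, :].T                    # 6 -> 2
decoder = Vt[:2, :]                      # 2 -> 6
reconstructed = H @ encoder @ decoder    # identical to rank_k_approx(2)

err = np.linalg.norm(H - reconstructed)  # Frobenius error of the rank-2 fit
print(round(float(err), 3))
```

A trained linear autoencoder would learn (up to rotation) the same subspace as `encoder`/`decoder`, which is why it can stand in for an explicit SVD when channels change too fast for labeled training data.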
    • Machine Learning Methods for Articulatory Data

      Archangeli, Diana B.; Fasel, Ian R.; Berry, Jeffrey James; Bever, Thomas G.; Morrison, Clayton T.; Chan, Erwin (The University of Arizona., 2012)
      Humans make use of more than just the audio signal to perceive speech. Behavioral and neurological research has shown that a person's knowledge of how speech is produced influences what is perceived. With methods for collecting articulatory data becoming more ubiquitous, methods for extracting useful information are needed to make these data accessible to speech scientists and usable in speech technology applications. This dissertation presents feature extraction methods for ultrasound images of the tongue and for data collected with an Electro-Magnetic Articulograph (EMA). The usefulness of these features is tested in several phoneme classification tasks. The feature extraction methods for ultrasound tongue images presented here consist of automatically tracing the tongue surface contour using a modified Deep Belief Network (DBN) (Hinton et al. 2006), and of methods inspired by research in face recognition which use the entire image. The tongue tracing method consists of training a DBN as an autoencoder on concatenated images and traces, and then retraining the first two layers to accept only the image at runtime. This 'translational' DBN (tDBN) method is shown to produce traces comparable to those made by human experts. An iterative bootstrapping procedure is presented for using the tDBN to assist a human expert in labeling a new data set. Tongue contour traces are compared with the Eigentongues method (Hueber et al. 2007) and a Gabor Jet representation in a 6-class phoneme classification task using Support Vector Classifiers (SVC), with Gabor Jets performing best. These SVC methods are compared to a tDBN classifier, which extracts features from raw images and classifies them with accuracy only slightly lower than the Gabor Jet SVC method. For EMA data, supervised binary SVC feature detectors are trained for each feature in three versions of Distinctive Feature Theory (DFT): Preliminaries (Jakobson et al. 1954), The Sound Pattern of English (Chomsky and Halle 1968), and Unified Feature Theory (Clements and Hume 1995). Each of these feature sets, together with a fourth unsupervised feature set learned using Independent Components Analysis (ICA), is compared on its usefulness in a 46-class phoneme recognition task. Phoneme recognition is performed using a linear-chain Conditional Random Field (CRF) (Lafferty et al. 2001), which takes advantage of the temporal nature of speech by looking at observations adjacent in time. Results of the phoneme recognition task show that Unified Feature Theory performs slightly better than the other versions of DFT. Surprisingly, ICA actually performs worse than running the CRF on raw EMA data.
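The advantage a linear-chain CRF takes from adjacent observations shows up at decoding time as a Viterbi search over per-frame emission scores plus transition scores. Here is a hypothetical three-label sketch (not the dissertation's 46-phoneme system); all scores are invented:

```python
# Hypothetical sketch of Viterbi decoding, the inference step behind a
# linear-chain CRF: each frame's label depends on both its own score and
# the transition from the previous label. Labels and scores are toy values.
labels = ["sil", "a", "t"]

# Per-frame (emission) scores, frames x labels: higher = better.
emit = [
    [2.0, 0.5, 0.1],
    [0.3, 2.2, 0.4],
    [0.2, 2.0, 0.6],
    [0.1, 0.4, 1.8],
]
# Transition scores; staying in the same label is mildly rewarded.
trans = [[0.5 if i == j else 0.0 for j in range(3)] for i in range(3)]

def viterbi():
    """Return the highest-scoring label sequence by dynamic programming."""
    score = list(emit[0])
    back = []
    for frame in emit[1:]:
        prev, ptr = score, []
        score = []
        for j in range(3):
            best_i = max(range(3), key=lambda i: prev[i] + trans[i][j])
            score.append(prev[best_i] + trans[best_i][j] + frame[j])
            ptr.append(best_i)
        back.append(ptr)
    j = max(range(3), key=lambda k: score[k])
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    return [labels[j] for j in reversed(path)]

print(viterbi())   # → ['sil', 'a', 'a', 't']
```

The self-transition bonus is what smooths frame-by-frame decisions into phoneme-length segments, which is the temporal advantage the abstract attributes to the CRF.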
    • Machine Learning Methods for Drug Evaluation and Treatment Assessment

      Roveda, Janet M.; Chen, Siteng; Rozenblit, Jerzy W.; Ditzler, Gregory; Khanna, May (The University of Arizona., 2020)
      Preclinical drug testing is a key step in evaluating the profile of a drug treatment. Many drug tests have been designed for different diseases. For instance, researchers manually count the number of peristaltic waves of Drosophila larvae to assess the severity of amyotrophic lateral sclerosis (ALS). In other cases, pharmacologists have to count dead cells by visual scoring to assess the performance of chemotherapy treatment. Labeling mitosis events is a time-consuming task and is thus prohibitive for large-scale drug screenings. Machine learning algorithms have allowed researchers to dramatically increase the throughput of analyzing large amounts of data. However, current methods require massive ground truth annotations, which are labor-intensive to obtain in biomedical experiments. Approaches with few human interventions remain unexplored. This dissertation focuses on three tasks for drug evaluation and treatment assessment. First, we propose a machine learning method to evaluate the effectiveness of a drug for ALS. This method leverages t-Distributed Stochastic Neighbor Embedding (tSNE) and statistical analysis to assess the locomotion behavior of Drosophila larvae and compare the difference between groups with and without the test drug. Second, we design a first-of-its-kind weakly supervised deep neural network for dead cell detection and counting. Compared with many existing fully supervised approaches, our approach only requires image-level ground truth. We show classification performance compared to general-purpose and cell classification networks, and report results for the image-level supervised counting task. Last but not least, we propose a sequence-level supervised neural network model using convolutional long short-term memory (ConvLSTM) and convolutional layers to detect mitosis events at the pixel and frame levels. By using binary labels, the proposed network is able to localize cell division spatially and temporally. We have evaluated our method on stem cell time-lapse images. With significantly less ground truth in the training data, our method achieves competitive performance compared with state-of-the-art fully supervised mitosis detection methods.
    • Machine Learning Methods for Microarray Data Analysis

      Barnard, Jacobus; Gabbur, Prasad; Rodriguez, Jeffrey; Hua, Hong (The University of Arizona., 2010)
      Microarrays emerged in the 1990s as a consequence of efforts to speed up the process of drug discovery. They revolutionized molecular biological research by enabling the monitoring of thousands of genes together. Typical microarray experiments measure the expression levels of a large number of genes on very few tissue samples. The resulting sparsity of data presents major challenges to the statistical methods used to perform any kind of analysis on these data. This research posits that phenotypic classification and prediction serve as good objective functions for both optimization and evaluation of microarray data analysis methods. This is because classification measures what is needed for diagnostics and provides quantitative performance measures such as leave-one-out (LOO) or held-out prediction accuracy and confidence. Under the classification framework, various microarray data normalization procedures are evaluated using a class label hypothesis testing framework and also employing Support Vector Machines (SVM) and linear discriminant based classifiers. A novel normalization technique based on minimizing the squared correlation coefficients between expression levels of gene pairs is proposed and evaluated along with the other methods. Our results suggest that most normalization methods helped classification on the datasets considered, except the rank method, most likely due to its quantization effects. Another contribution of this research is in developing machine learning methods for incorporating an independent source of information, in the form of gene annotations, to analyze microarray data. Recently, the genes of many organisms have been annotated with terms from a limited vocabulary, the Gene Ontology (GO), describing the genes' roles in various biological processes, molecular functions, and their locations within the cell. Novel probabilistic generative models are proposed for clustering genes using both their expression levels and GO tags. These models are similar in essence to those used for multimodal data, such as images and words, with learning and inference done in a Bayesian framework. The multimodal generative models are used for phenotypic class prediction. More specifically, the problems of phenotype prediction for static gene expression data and state prediction for time-course data are emphasized. Using GO tags for organisms whose genes have been studied more comprehensively leads to an improvement in prediction. Our methods also have the potential to provide a way to assess the quality of available GO tags for the genes of various model organisms.
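The leave-one-out (LOO) prediction accuracy used above as an objective can be sketched with a toy nearest-centroid classifier on an invented two-gene dataset (the real work uses SVMs and linear discriminants on thousands of genes):

```python
# Hypothetical sketch of leave-one-out (LOO) evaluation: hold out each
# sample once, train on the rest, and score the held-out prediction.
# The two-gene expression values and labels are invented.
samples = [
    ([1.0, 5.0], "tumor"), ([1.2, 4.8], "tumor"), ([0.8, 5.2], "tumor"),
    ([4.0, 1.0], "normal"), ([4.2, 1.1], "normal"), ([3.9, 0.8], "normal"),
]

def centroid(rows):
    cols = list(zip(*rows))
    return [sum(c) / len(c) for c in cols]

def predict(x, train):
    """Assign x to the class with the nearest mean expression profile."""
    best, best_d = None, float("inf")
    for label in {"tumor", "normal"}:
        rows = [v for v, l in train if l == label]
        d = sum((a - b) ** 2 for a, b in zip(x, centroid(rows)))
        if d < best_d:
            best, best_d = label, d
    return best

# Leave-one-out: each sample is predicted by a model that never saw it.
correct = sum(
    predict(x, samples[:i] + samples[i + 1:]) == y
    for i, (x, y) in enumerate(samples)
)
loo_accuracy = correct / len(samples)
print(loo_accuracy)
```

With so few tissue samples per experiment, LOO is attractive because every sample serves as a test case exactly once, which is why the abstract singles it out as a performance measure.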
    • Machine Learning Multi-Stage Classification and Regression in the Search for Vector-like Quarks and the Neyman Construction in Signal Searches

      Cheu, Elliott C.; Leone, Robert Matthew; Johns, Kenneth A.; Varnes, Erich W.; Fleming, Sean P. (The University of Arizona., 2016)
      A search for vector-like quarks (VLQs) decaying to a Z boson using multi-stage machine learning was compared to a search using a standard square-cuts search strategy. VLQs are predicted by several new theories beyond the Standard Model. The searches used 20.3 inverse femtobarns of proton-proton collisions at a center-of-mass energy of 8 TeV collected with the ATLAS detector in 2012 at the CERN Large Hadron Collider. CLs upper limits on the production cross sections of vector-like top and bottom quarks were computed for VLQs produced singly or in pairs (Tsingle, Bsingle, Tpair, and Bpair). The two-stage machine learning classification search strategy did not provide any improvement over the standard square-cuts strategy, but for Tpair, Bpair, and Tsingle, a third stage of machine learning regression was able to lower the upper limits at high signal masses by as much as 50%. Additionally, new test statistics were developed for use in the Neyman construction of confidence regions in order to address deficiencies in current frequentist methods, such as the generation of empty-set confidence intervals. A new method for treating nuisance parameters was also developed that may provide better coverage properties than current methods used in particle searches. Finally, significance ratio functions were derived that allow a more nuanced interpretation of the evidence provided by measurements than is given by confidence intervals alone.
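For a single-bin Poisson counting experiment, the CLs criterion behind the upper limits above works as in this hypothetical sketch. The observed count and background are invented, and real VLQ limits involve full likelihoods and nuisance parameters rather than a bare count:

```python
# Hypothetical sketch of the CLs upper-limit criterion for a one-bin
# Poisson counting experiment: a signal strength s is excluded at 95%
# confidence when CLs = CLs+b / CLb < 0.05. All numbers are toy values.
from math import exp

def poisson_cdf(n, mu):
    """P(N <= n) for N ~ Poisson(mu), summed term by term."""
    term, total = exp(-mu), exp(-mu)
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total

n_obs, background = 3, 3.0

def cls(signal):
    cls_sb = poisson_cdf(n_obs, signal + background)   # CLs+b
    cl_b = poisson_cdf(n_obs, background)              # CLb
    return cls_sb / cl_b

# Scan signal strengths for the smallest excluded one (the 95% CL limit).
limit = next(s / 10.0 for s in range(1, 200) if cls(s / 10.0) < 0.05)
print(limit)
```

Dividing by CLb is what protects against excluding signals the experiment has no sensitivity to (the empty-interval pathology the abstract's new test statistics also address): a downward background fluctuation shrinks CLs+b and CLb together.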
    • Machine Learning, Optimization, and Anti-Training with Sacrificial Data

      Rozenblit, Jerzy W.; Head, Kenneth L.; Valenzuela, Michael Lawrence; Lysecky, Roman L.; Marcellin, Michael W. (The University of Arizona., 2016)
      Traditionally, the machine learning community has viewed the No Free Lunch (NFL) theorems for search and optimization as a limitation. I review, analyze, and unify the NFL theorems with their many frameworks to arrive at necessary conditions for improving black-box optimization, model selection, and machine learning in general. I review the meta-learning literature to determine when and how meta-learning can benefit machine learning. I generalize meta-learning, in the context of the NFL theorems, to arrive at a novel technique called Anti-Training with Sacrificial Data (ATSD). My technique applies at the meta level to arrive at domain-specific algorithms and models. I also show how to generate sacrificial data. An extensive case study is presented, along with simulated annealing results, to demonstrate the efficacy of the ATSD method.
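The anti-training idea can be caricatured in a few lines: at the meta level, pick an optimizer setting that works well on the problems we expect and badly on sacrificial problems we believe will not occur. Everything below, the toy hill climber, the problem sets, and the penalty weight, is an invented illustration, not the ATSD method itself:

```python
# Hypothetical caricature of anti-training with sacrificial data: the
# meta-objective rewards low final cost on expected problems and penalizes
# low final cost on sacrificial (believed-impossible) problems.
def hill_climb(f, step, iters=60):
    """Minimize f starting at x=4.0 with a fixed-step local search."""
    x = 4.0
    for _ in range(iters):
        candidates = [x - step, x, x + step]
        x = min(candidates, key=f)   # ties resolve to the first candidate
    return f(x)

expected = [lambda x: (x - 1.0) ** 2, lambda x: abs(x + 2.0)]      # well-behaved
sacrificial = [lambda x: 0.0 if abs(x - 2.5) < 0.05 else 1.0]      # needle

def meta_score(step, lam=1.0):
    good = sum(hill_climb(f, step) for f in expected)
    bad = sum(hill_climb(f, step) for f in sacrificial)
    return good - lam * bad    # low cost on expected, high cost on sacrificial

steps = [0.01, 0.1, 0.5, 1.0]
best_step = min(steps, key=meta_score)
print(best_step)
```

Without the sacrificial penalty, several step sizes solve the expected problems equally well; the penalty breaks the tie toward the setting that also fails on the needle function, specializing the optimizer to the assumed problem class.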
    • Machine Learning-based Author Identification for Social Media Forensics

      Hariri, Salim; Shao, Sicong; Ditzler, Gregory; Akoglu, Ali (The University of Arizona., 2021)
      Social media have gained extreme popularity due to the explosive growth of cyberinfrastructures, mobile devices, Internet technologies, and services. However, they also provide potential anonymity, which in turn harbors hacker forums, carding shops, underground marketplace, dark websites, and so on. As a result, social media have become the playground of cyber threat actors who conduct various malicious operations such as selling stolen cards, disseminating misinformation, propagating hacking tools, spreading malware samples, planning cyberattacks, and organizing trolling campaigns. Therefore, it is urgent to study effective methods that can identify the authors behind the digital text in order to enable forensic analysis, enhance security, and reduce social media misuse. In recent years, machine learning-based author identification has become a promising solution to identify the author of text. However, it is still an underexplored research field in social media forensics. This thesis investigates machine learning-based author identification subfields, including author attribution, author verification, author clustering, and their applications to social media forensics. Internet Relay Chat (IRC) has traditionally been used for legitimate purposes. Yet, cyber threat actors extensively abuse it to generate a wide range of illegal content and perform malicious behaviors due to its potential anonymity and popularity among hackers. Unfortunately, author identification research in IRC remains a largely underexplored area. In this thesis, we first present our automatic social media monitoring and threat detection method that can effectively collect data for author identification tasks and then present a novel author attribution framework and its application to IRC. It consists of a holistic feature extraction model and an ensemble of ensembles for multi-class classification. 
We then present a novel author verification framework based on the principle of one-class learning to effectively verify the authorship of IRC texts. This research also examines author clustering for social media forensics. Most author identification studies focus on author attribution and author verification, while author clustering research is largely ignored. Meanwhile, cyber threat actors widely make use of Twitter to create alias accounts for numerous malicious purposes, especially in trolling campaigns and misinformation propagation. Thus, developing an effective author clustering method for Twitter is urgent. In this research, we developed a novel unsupervised learning-based author clustering framework and applied it to Twitter. It can identify groups among many Twitter aliases even without prior knowledge of the number of authors. We demonstrate the effectiveness and feasibility of our author identification frameworks through diverse experiments. Our author attribution approach achieves more than 90% attribution accuracy given hundreds of candidates in the author attribution experiments. In the author verification experiments, our approach achieves an AUC above 99% for over 70% of author cases. In the author clustering experiments, given more than one hundred unlabeled text samples, our author clustering approach attains an average accuracy of 81.93% when the number of authors is known and an average accuracy of 74.78% without prior knowledge of the number of authors.
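The one-class verification idea described above can be sketched with off-the-shelf tools. The following is a minimal illustration, not the dissertation's actual pipeline: the texts, the character n-gram features, and the One-Class SVM settings are all assumptions made for the example.

```python
# Minimal one-class author verification sketch (illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import OneClassSVM

# Hypothetical chat lines by one known author.
known_author = [
    "lol thats wild, brb gonna check the logs",
    "brb, logs look clean lol, nothing in there",
    "gonna grep the logs again lol, hold on",
]
# Two disputed texts: one stylistically similar, one not.
candidates = [
    "lol brb checking those logs again",
    "Per our earlier discussion, kindly review the attached report.",
]

# Character n-grams capture sub-word style cues (spelling, punctuation).
vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 3))
X_known = vec.fit_transform(known_author)

# One-class learning: train only on the known author's texts.
clf = OneClassSVM(kernel="linear", nu=0.5).fit(X_known)
scores = clf.decision_function(vec.transform(candidates))
print(scores)  # higher score = more consistent with the known author
```

Training on positive samples only is what distinguishes verification from attribution: no texts from other authors are needed to score a disputed text.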
    • Machine Reading for Scientific Discovery

      Fong, Sandiway; Surdeanu, Mihai; Hahn-Powell, Gus; Morrison, Clayton (The University of Arizona., 2018)
      The aim of this work is to accelerate scientific discovery by advancing machine reading approaches designed to extract claims and assertions made in the literature, assemble these statements into cohesive models, and generate novel hypotheses that synthesize findings from isolated research communities. Over 1 million new publications are added to the biomedical literature each year. This poses a serious challenge to researchers needing to understand the state of the field. It is effectively impossible for an individual to summarize the larger body of work or even remain abreast of research findings directly relevant to a subtopic. As the boundaries between disciplines continue to blur, the question of what to read grows more complicated. Researchers must inevitably turn to machine reading techniques to summarize findings, detect contradictions, and illuminate the inner workings of complex systems. Machine reading is a research program in artificial intelligence centered on teaching computers to read and comprehend natural language text. Through large-scale machine reading of the scientific literature, we can greatly advance our understanding of the natural world. Despite remarkable progress (Gunning et al., 2010; Berant et al., 2014; Cohen, 2015a), current machine reading systems face two major obstacles which impede wider adoption: <i>Assembly</i> The majority of machine reading systems extract disconnected findings from the literature (Berant et al., 2014). In areas of study such as biology, which involve large mechanistic systems with many interdependent components, it is essential that the insights scattered across the literature be contextualized and carefully integrated. The single greatest challenge facing machine reading is in learning to piece together this intricate puzzle to form coherent models and mitigate information overload. 
In this work, I will demonstrate how disparate biomolecular statements mined from text can be causally ordered into chains of reactions (Hahn-Powell et al., 2016b) that extend our understanding of mechanistic biology. Then, moving beyond a single domain, we will see how machine-read fragments (influence relations) drawn from a multitude of disciplines can be assembled into models of children’s health. <i>Hypothesis generation and “undiscovered public knowledge”</i> (Swanson, 1986a) Without a notion of research communities and their interaction, machine reading systems struggle to identify knowledge gaps and key ideas capable of bridging disciplines and fostering the kind of collaboration that accelerates scientific progress. With this aim in mind, I introduce a procedure for detecting research communities using a large citation network and derive semantic representations that encode a measure of the flow of information between these groups. Finally, I leverage these representations to uncover influence relation pathways which connect otherwise isolated communities.
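Detecting research communities in a citation network can be sketched with standard graph tooling. The toy graph and the use of greedy modularity maximization below are assumptions for illustration; the dissertation's actual corpus and detection procedure are not reproduced here.

```python
# Toy community detection on an invented citation graph (illustrative only).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
# Two densely self-citing "fields" linked by a single bridging citation.
G.add_edges_from([
    ("bio1", "bio2"), ("bio2", "bio3"), ("bio1", "bio3"),  # biology papers
    ("nlp1", "nlp2"), ("nlp2", "nlp3"), ("nlp1", "nlp3"),  # NLP papers
    ("bio3", "nlp1"),                                      # bridge paper
])

# Greedy modularity maximization groups the papers into communities.
communities = greedy_modularity_communities(G)
print([sorted(c) for c in communities])
```

On a real citation network, the sparse edges between detected communities are exactly where bridging ideas and "undiscovered public knowledge" would be sought.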
    • Macro- and Micro-Scale Geoarchaeology of Ucagizli Caves I and II, Hatay, Turkey

      Stiner, Mary C.; Holliday, Vance T.; Mentzer, Susan Marie; Goldberg, Paul; Kuhn, Steven L.; Quade, Jay (The University of Arizona., 2011)
This project documents the multi-scalar formation processes of two northern Levantine coastal Paleolithic cave sites using field geology, archaeological micromorphology and sediment geochemistry. Located within several hundred meters of each other, the sequences from Üçağızlı I and II present an opportunity to compare late Middle and early Upper Paleolithic hominin adaptations to a similar coastal environment. The morphologies of the sites and the suite of coastal geomorphic features available to the area's Paleolithic occupants were impacted by fluctuations in sea level as well as tectonic events. The sites share similar formation histories that include active karstic processes, marine inundation, occupation by hominins, partial collapse of the cave vaults, and erosion of the uppermost archaeological deposits. Mousterian occupation of Üçağızlı II began after the formation of a series of stable sea level features that date to Marine Isotope Stage (MIS) 5a. Hominin utilization of the highly eroded portions of the cave continued at least through the middle of MIS 3, although the cultural attribution of the youngest materials is presently unknown. Üçağızlı I contains a sequence of Initial Upper Paleolithic, Ahmarian and Epipaleolithic materials dating to MIS 3 and 2. Micromorphology of the archaeological sediments reveals strong anthropogenic contributions to the infilling of both caves, in particular the deposition of abundant, well-preserved wood ashes. In both sequences, post-depositional insect bioturbation has negatively impacted the combustion features, resulting in alteration of the original sedimentary fabrics and loss of information regarding hominin activities such as sweeping, rake-out and dumping of ashes. 
In Üçağızlı II, the dominant mode of sedimentation is anthropogenic; a series of intact and cemented combustion features located beneath the highest point of the cave ceiling is surrounded by sediment exhibiting evidence of both rodent and insect bioturbation. In Üçağızlı I, phases of human activity alternated with periods of natural sedimentation. Combustion features in the site include isolated hearths, stacks of hearths, rake-out or sweeping deposits, ash dumps, and mixed burned materials that have been impacted by colluvial reworking and bioturbation. In sum, the two sites contain similar types of anthropogenic sediments despite differing cultural affiliation.

      Simpson, Phillip Michael, 1943- (The University of Arizona., 1971)
    • Macroecology: Going from patterns to processes, a theory and its test

      Rosenzweig, Michael L.; McGill, Brian James (The University of Arizona., 2003)
This dissertation focuses on two patterns in macroecology. The first describes the distribution of abundances between species (SAD) within a single community. The second describes the structure of abundance across a species range (SAASR). The central result is that the SAASR, combined with some other assumptions, can be shown both theoretically and empirically to explain the SAD (as well as several other patterns, such as the species-area relationship, or SPAR). Given the increased importance of the SAASR pattern, I then provide an extensive analysis of empirical data to test for the existence and exact nature of the SAASR and to develop the first quantitative assessments of proposed mechanisms underlying the SAASR. I also clarify a current point of confusion about SADs: whether they are truly log left-skewed. I next present a philosophy of science paper on how best to test macroecological theories. Finally, I apply this approach to a well-known macroecological theory that is generally considered to be strongly tested and show that the existing tests are, in fact, weak.
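The "log left-skew" question about SADs is a question about the third moment of log-abundances. A minimal sketch of how one would check it, using a simulated community rather than any of the dissertation's empirical data sets:

```python
# Skewness of a simulated species abundance distribution on a log scale
# (illustrative only; the community below is invented).
import math
import random

random.seed(42)
# Abundances for 200 species drawn from a lognormal-like distribution.
abundances = [max(1, round(random.lognormvariate(3, 1.5))) for _ in range(200)]
logs = [math.log(a) for a in abundances]

# Moment-based sample skewness of the log-abundances.
n = len(logs)
mean = sum(logs) / n
var = sum((x - mean) ** 2 for x in logs) / n
skew = sum((x - mean) ** 3 for x in logs) / n / var ** 1.5
print(round(skew, 3))  # negative = left-skewed on the log scale
```

A pure lognormal community gives skewness near zero on the log scale; a genuinely log left-skewed SAD would show a clearly negative value, which is the empirical signature at issue.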

      Spencer, John W. (John William), 1940- (The University of Arizona., 1972)
    • Macrophage response to polymeric vascular grafts

      Williams, Stuart K.; Salzmann, Dennis Lee, 1970- (The University of Arizona., 1997)
The use of materials for replacement or repair of biological tissue and organs has been attempted for thousands of years. Regardless of the material used or site of implantation, all biomedical materials elicit a foreign body response by the host, characterized by the presence of macrophages and foreign body giant cells associated with the polymer for the duration of the implant. This inflammatory response is believed to be responsible for the lack of biocompatibility of implanted materials. Furthermore, each type of biomedical device suffers from specific problems that may lead to the ultimate failure of the implant. Synthetic polymeric vascular grafts fail primarily due to the inherent thrombogenicity of the material and anastomotic neointimal thickening. In an attempt to create a non-thrombogenic lining on the blood-contacting surface of vascular implants, the promotion of an endothelial lining on the luminal surface of vascular grafts has been investigated. This can be accomplished by both artificial and natural mechanisms. Regardless, it is believed that the inflammatory response elicited by the implant influences the angiogenic mechanisms and neointimal thickening associated with the implant. The relationship between inflammation and angiogenesis associated with biomedical implants remains to be delineated. Studies in this dissertation attempt to determine this relationship by examining the inflammatory response and inflammatory cytokines released by cells associated with polymeric implants and how these bioactive molecules influence the angiogenic response. Furthermore, an advancing technology in vascular repair, endovascular grafts, was tested in two vascular models to assess the general healing characteristics, inflammatory response and the formation of blood vessels associated with the device. 
The results from these studies suggest that the inflammatory response plays a fundamental role in the formation of blood vessels around polymeric implants and neointimal thickening on the luminal surface of vascular implants. From these experiments a greater understanding of the healing response associated with vascular grafts has resulted.
    • Macroscopic lattice dynamics.

      Miller, Peter David. (The University of Arizona., 1994)
      The modulational behavior of exact oscillatory solutions to a family of non-linear systems of coupled differential equations is studied both numerically and analytically. The family of lattice systems investigated has applications ranging from theoretical biology to numerical methods. The goal is to obtain a description, given by a system of partial differential equations valid on long spatial and temporal scales, of the microscopic vibrations in the lattice. A theory of simple harmonic plane wave modulation is given for the entire family of microscopic systems, and the structure of the corresponding modulation equations is analyzed; particular utility is gained by casting the modulation equations in Riemann invariant form. Although difficulties are encountered in extending this theory to more complicated oscillatory modes in general, the special case of the integrable Ablowitz-Ladik system allows the program of describing more complicated modulated oscillations to be carried out virtually to completion. An infinite hierarchy of multiphase wavetrain solutions to these equations is obtained exactly using methods of algebraic geometry, and the complete set of equations describing the modulational behavior of each kind of multiphase wavetrain is written down using the same machinery. The distinguishing features of modulation theory in the presence of resonance are described, and an unusual set of modulation equations is derived in this case. The results of this dissertation can be interpreted in the context of nonequilibrium thermodynamics of regular oscillations in nonlinear lattices; instabilities in the modulation equations correspond to predictable phase transitions.
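The plane-wave solutions discussed above can be illustrated numerically. The sketch below integrates the Ablowitz-Ladik lattice under one common sign convention, i da_n/dt + (a_{n+1} + a_{n-1} - 2a_n) + |a_n|^2 (a_{n+1} + a_{n-1}) = 0, with classical RK4 on a small periodic ring; the lattice size, time step, and initial amplitude are assumptions for the example, not values from the dissertation.

```python
# RK4 integration of the Ablowitz-Ladik lattice on a periodic ring
# (illustrative parameters; one common sign convention assumed).
import cmath
import math

N, dt, steps = 16, 0.005, 400

def rhs(a):
    """da_n/dt = i[(a_{n+1} + a_{n-1} - 2 a_n) + |a_n|^2 (a_{n+1} + a_{n-1})]."""
    out = []
    for n in range(N):
        nb = a[(n + 1) % N] + a[(n - 1) % N]
        out.append(1j * ((nb - 2 * a[n]) + abs(a[n]) ** 2 * nb))
    return out

def rk4_step(a):
    k1 = rhs(a)
    k2 = rhs([x + 0.5 * dt * k for x, k in zip(a, k1)])
    k3 = rhs([x + 0.5 * dt * k for x, k in zip(a, k2)])
    k4 = rhs([x + dt * k for x, k in zip(a, k3)])
    return [x + dt / 6 * (p + 2 * q + 2 * r + s)
            for x, p, q, r, s in zip(a, k1, k2, k3, k4)]

# Simple harmonic plane-wave initial condition a_n = A exp(i k n).
a = [0.3 * cmath.exp(1j * 2 * math.pi * n / N) for n in range(N)]
norm0 = sum(math.log(1 + abs(x) ** 2) for x in a)  # an AL conserved quantity
for _ in range(steps):
    a = rk4_step(a)
norm1 = sum(math.log(1 + abs(x) ** 2) for x in a)
print(abs(norm1 - norm0))  # drift should be tiny for a good integrator
```

Tracking a conserved quantity such as the sum of ln(1 + |a_n|^2) is a standard sanity check on the integration before any modulational behavior is measured.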
    • Mad Mark Twain: Rage and Rhetoric in the Life and Works of Samuel L. Clemens

      Jenkins, Jennifer L.; Fredericks, Sarah Elizabeth; Hurh, Paul; Abraham, Matthew (The University of Arizona., 2020)
      Interweaving literary biography, rhetoric, and emotion studies, this dissertation argues that anger was fundamental to Mark Twain’s social and literary epistemologies. Although scholars have largely dismissed his temper as anecdotal, Twain considered anger vital to maintaining social order and strategically employed angry rhetoric in his personal and professional writings. Neither irrational nor haphazard, Twain’s vitriol demonstrates remarkable rhetorical awareness and literary artistry. Whether haranguing his publishers about dwindling profits or eviscerating his private secretary Isabel Lyon in the little-known Ashcroft-Lyon Manuscript, Twain weaponized his emotions utilizing classical Aristotelian theories of persuasion. Moreover, many defining literary tropes of Twain’s most celebrated works originated in these angry texts, further cementing their importance to his literary development. Through close reading of his newspaper articles, letters, and autobiographical texts, this study traces evolving rhetorical patterns in Twain’s vituperation and demonstrates how his anger script impacted his participation in nineteenth-century literary culture.

      Knapp, Judith Poole (The University of Arizona., 1980)
      Color and light, consistent with most visual phenomena in Madame Bovary, are more than mere descriptive tools: they actually serve as vehicles for Flaubert's characteristic use of symbolism. When taken cumulatively throughout the novel, the meanings ascribed to certain color and lighting effects often symbolize specific situations or a character's psychology, while at the same time reflecting a particular point of view. This dissertation initially examines the questions of point of view, major themes and Emma's psychology. Though most of the novel is recounted by an omniscient third-person narrator, he frequently takes a back seat so that Emma's point of view, for one, becomes the dominant manner of presentation. By shifting from one point of view to another, the narrator presents us with much conflicting symbolism--are we witnessing a scene and its color and light through Emma's dreamy gaze or perhaps in a more objective light shed by the narrator? An additional source of conflict is to be found in Emma's psychology and the major themes of Madame Bovary, as they both center around the heroine's inability to distinguish dreams from reality, with reality eventually gaining the upper hand and crushing Emma's dream world. Color and light symbolism naturally mirror all of these conflicts, with positive symbols often overshadowed by negative ones. There are three basic types of illumination present in the novel--(1) dim light reflecting Emma's romantic nature; (2) harsh, revealing brightness which, in the present, sheds light on an all-too pervasive reality; and (3) a lack of illumination emphasizing Emma's depression and leading ultimately to the utter darkness of death. Seven individual colors are explored for their symbolic aspects: blue, white, yellow, black, red, pale, and green. Blue symbolizes Emma's dreams and aspirations, her desire to attain an always nebulous higher state of being, which of course she will never reach. 
White can at times be interpreted along classical lines as representing innocence, naivete, and potential, or conversely emptiness and ennui, as in the case of this same potential remaining unfulfilled. Yellow signifies reality which is always ready to engulf Emma and her dreams and is seen as yellowing the whiteness of her potential. Black takes on several symbolic connotations, usually dependent upon the point of view of the person lending it symbolic value. It can be seen as a reflection of the Church, of mystery, or, for Emma, of the perfect romantic hero who must dress in black. As the narrator is aware, however, and communicates to the reader, all meanings of black in the novel merely culminate in its traditional connotation, that of death, in this case, Emma's of course. Red is another shade which can be divided into positive and negative aspects, with the positive signifying sensuality, voluptuousness, and by extension a certain erotic vision of love. On the negative side, we find many characteristics of red that Emma herself would consider disagreeable: a peasant origin, outlook or attitude, and a lack of sophistication sometimes coupled with crudeness or insensitivity. One or more of three basic meanings can be ascribed to pale in any given context; it can represent a dull, uninteresting existence, a romantic ideal--for Emma--, or merely a pallor caused by illness or indisposition. Green, the final hue treated, is a secondary color on the artist's palette combining the blue of dreams and the yellow of reality, thus creating a feeling of malediction for Emma and a fatal mixture, since one cannot survive in the face of the other. In the end, Emma is forced to recognize the reality which had been so clearly illuminated throughout the novel by the narrator and, unable to face the light, she ironically turns instead to the total darkness of death.