April 15th, 2022
Categories: Applications, Data Mining, Networking, Software, User Groups, Visualization, Natural Language Processing, Visual Analytics, Visual Informatics, Deep Learning, Machine Learning, Data Science, Artificial Intelligence, High Performance Computing
In today’s Big Data era, data scientists need modern workflows that can quickly analyze large-scale datasets with complex codes in order to maintain the rate of scientific progress. These scientists often rely on available campus resources or off-the-shelf computational systems for their applications. Unified infrastructure or over-provisioned servers can quickly become bottlenecks for specific tasks, wasting time and resources. Composable infrastructure helps solve these problems by giving users new ways to increase resource utilization. It disaggregates a computer’s components – CPU, GPU (accelerators), storage and networking – into fluid pools of resources, but typically relies on infrastructure engineers to architect individual machines. The infrastructure is managed with specialized command-line utilities, user interfaces, or specification files. These management models are cumbersome and difficult to incorporate into data-science workflows. We developed a high-level software API, Composastructure, which can be integrated into modern workflows and used by infrastructure engineers and data scientists alike to reorganize composable resources on demand. Composastructure enables infrastructures to be programmable, secure, persistent and reproducible. Our API composes machines, frees resources, supports multi-rack operations, and includes a Python module for Jupyter Notebooks.
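To make the idea of composing machines from pooled resources and freeing them again concrete, the following is a minimal in-memory sketch. It is not the Composastructure API: the class `ResourcePool`, its methods `compose` and `release`, the device kinds, and the machine name are all hypothetical names chosen for illustration, and no real hardware is touched.

```python
from dataclasses import dataclass, field


@dataclass
class ResourcePool:
    """Hypothetical pool of disaggregated devices (illustration only)."""
    # Assumed starting inventory; a real pool would be discovered from hardware.
    free: dict = field(default_factory=lambda: {"gpu": 8, "nvme": 16, "nic": 4})
    machines: dict = field(default_factory=dict)

    def compose(self, name, **request):
        """Attach the requested device counts to a named machine."""
        if name in self.machines:
            raise ValueError(f"machine {name!r} is already composed")
        # Check the whole request first so a failure leaves the pool unchanged.
        short = {k: n for k, n in request.items() if self.free.get(k, 0) < n}
        if short:
            raise RuntimeError(f"insufficient resources: {short}")
        for kind, count in request.items():
            self.free[kind] -= count
        self.machines[name] = dict(request)
        return self.machines[name]

    def release(self, name):
        """Return a machine's devices to the pool."""
        for kind, count in self.machines.pop(name).items():
            self.free[kind] += count


pool = ResourcePool()
pool.compose("train-node", gpu=4, nvme=2)   # borrow devices for one task
print(pool.free)
pool.release("train-node")                  # hand them back when done
print(pool.free)
```

Calls of this kind could sit directly in a notebook cell, so a workflow can request accelerators before a training step and return them afterwards, which is the kind of on-demand reorganization the abstract describes.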
Keywords - distributed systems, testbed implementation and deployment, composable infrastructure, deep learning, visualization, infrastructure as code
Chen, Z., Renambot, L., Long, L., Brown, M., Johnson, A.E., Moving from Composable to Programmable, April 15th, 2022.