My primary research interests lie in the areas of high-performance computing, energy-efficient storage systems, parallel/distributed systems, and security-aware scheduling. In particular, I have been investigating efficient scheduling schemes and resource management strategies to support high-performance applications running on dedicated and non-dedicated clusters, which consist of off-the-shelf hardware and software components. After joining SDSU, I concentrated on high-performance, highly reliable storage systems and energy-efficient computing. My work on these topics focuses on the following research problems: data placement, data redistribution, data reconstruction, the relationship between energy-saving techniques and disk reliability, and flash-based mobile disk arrays. In what follows, I summarize my past and current research activities.
A Device-Array Based Flash Storage System for Emerging Data-Intensive and Mission-Critical Mobile Applications: from Architecture Redesign to New File System
(single PI, funded by the National Science Foundation under grant CNS-1320738, $440,727, 10/2013 ~ 09/2016)
Flash memory manufacturers are aggressively scaling up NAND flash density in order to increase capacity and reduce the cost per gigabyte using either MLC (multi-level cell) or TLC (triple-level cell) technologies. However, other metrics such as reliability, endurance, and performance are all declining. As a result, developing a high-performance and highly reliable embedded flash storage system on top of increasingly larger but inferior NAND flash memory devices has become both indispensable and challenging. Utilizing a holistic approach from hardware re-architecting to software redesign, this project designs, implements, and evaluates a new flash storage system for emerging and future data-intensive mobile applications such as wireless healthcare and live sports broadcasting. In particular, this project will replace the existing single-device hardware organization with a multiple-device array architecture. Next, this project will develop a new flash file system that can access multiple flash memory devices in parallel. Also, a wide spectrum of new techniques, including a garbage collection method, a wear-levelling mechanism, ECC protection, and a data recovery scheme, will be developed. Finally, a hardware prototype that can empirically evaluate the new flash device array architecture and all software modules will be built. The outcomes of this project, namely the new flash file system (including source code and documentation), all new techniques, and the hardware prototype, will be released to the public. This project will also promote teaching, learning, and training by exposing students to technological and scientific underpinnings in the field of mobile storage systems.
CAREER: Architectural Support for Integrating NAND Flash Solid State Disks into Enterprise-Class Storage Systems
(funded by the National Science Foundation under grant CNS-0845105, $436,000, 09/2009 ~ 08/2014)
With recent advances in capacity, bandwidth, and durability, NAND flash memory has been successfully employed in mobile devices such as PDAs and laptops, and it is starting to replace hard disks in desktop systems. Integrating NAND flash memory into server-domain applications, which normally demand a high level of data reliability and exceptional random I/O performance, is much more challenging, however, because NAND flash memory exhibits relatively poor random write performance and insufficient reliability due to limited erasure cycles. To address these problems, architectural support for flash SSDs must be devised in order to fundamentally boost their performance and longevity through a combined software/hardware effort. In this project, we will develop a novel flash disk storage architecture that exploits the addition of RAM and dedicated software schemes to incorporate flash SSDs into enterprise-class storage systems. We plan to implement a flash disk array prototype and deploy it in real-world data-intensive applications. In addition, we will develop new software techniques such as a double-buffer write ordering management scheme and an inter-disk wear-leveling technique. This project will contribute to energy conservation, performance enhancement, data management, and reliability technology for enterprise-class storage systems by developing the flash disk array storage architecture, accompanied by an array of new software schemes. This project will also promote teaching, learning, and training by exposing both undergraduate and underrepresented students to technological and scientific underpinnings in the field of server-class storage systems.
Energy-Efficient and Reliability-Aware Data Management in Mobile Storage Systems
(single PI, funded by the National Science Foundation under grant CNS-0834466, $160,000, 09/2008 ~ 08/2010)
Highly reliable, high-performance, and energy-efficient storage systems are essential for mobile data-intensive applications such as remote surgery and mobile data centers. Existing mobile storage systems generally consist of an array of independent small form factor hard disks connected to a host by a storage interface in a mobile computing environment. Although hard disks are cost-effective and can provide huge capacity and high throughput, they have some intrinsic limitations such as long access latencies, high annual disk replacement rates, fragile physical characteristics, and energy inefficiency. Compared with hard disk drives, flash disks are much more robust and energy-efficient, and can offer much faster access times. A major concern with current flash disks is their relatively high price. This project develops a hybrid disk array system, which integrates small-capacity flash disks with high-capacity hard disk drives to form a robust and energy-efficient storage system for mobile data-intensive applications. In particular, an array of new data management techniques, including energy-efficient data placement, self-adaptive and reliability-aware data redistribution, and self-triggered data replication, will be developed for data-intensive mobile applications built on the hybrid disk array framework. In addition, this project implements a simulation toolkit designed specifically to study a variety of data management techniques on top of the hybrid disk array architecture. This project will also promote teaching, learning, and training by exposing students to technological and scientific underpinnings in the field of energy-efficient storage systems. To enhance education outreach to local underrepresented groups of undergraduate students, this project organizes a summer workshop on energy-efficient computing at San Diego State University.
BUD: A Buffer-Disk Architecture for Energy Conservation in Parallel Disk Systems
(Co-PI, funded by the National Science Foundation under grant CCF-0742187, the $311,999 grant was awarded to Auburn University with $90,244 being subcontracted to SDSU, 05/2007 ~ 04/2010)
Parallel disks consisting of multiple disks with a high-speed switched interconnect are ideal for data-intensive applications running in high-performance computing systems. Improving the energy efficiency of parallel disks is an intrinsic requirement of next-generation high-performance computing systems, because a storage subsystem can represent 27% of the energy consumed in a data center. However, conserving energy in parallel disks while coordinating the I/Os of hundreds or thousands of concurrent disk devices in an energy-efficient manner, so that both high-performance and energy-saving requirements are met, is a major challenge. This research investigates novel energy conservation techniques to provide significant energy savings while achieving low cost and high performance for parallel disks. In this research project, the investigators take an organized approach to implementing energy-saving techniques for parallel disks, simulating energy-efficient parallel disk systems, and conducting a physical demonstration. This research involves four tasks: (1) design and develop a buffer-disk (BUD) architecture to reduce energy dissipation in parallel disk systems; (2) develop innovative energy-saving techniques, including an energy-related reliability model, energy-aware data partitioning, disk request processing, data movement, data placement, prefetching strategies, and power management for buffer disks; (3) implement a simulation toolkit (BUDSIM) used to develop a variety of energy-saving techniques and their integration in the BUD architecture; and (4) validate the BUD architecture along with our innovative energy-conservation techniques using real data-intensive applications running on high-performance clusters. This research can benefit society by developing economically attractive and environmentally friendly parallel disk systems, which are able to lower electricity bills and reduce emissions of air pollutants. Furthermore, the BUD architecture and the energy-conservation techniques can be transferred to embedded disk systems, where power constraints are more severe than in conventional disk systems.
1. T. Xie, H. Wang, "MICRO: A Multi-level Caching-based Reconstruction Optimization for Mobile Storage Systems," IEEE Transactions on Computers, Vol. 57, No. 10, pp. 1386-1398, Oct. 2008.
2. T. Xie, "SEA: A Striping-based Energy-aware Strategy for Data Placement in RAID-Structured Storage Systems," IEEE Transactions on Computers, Vol. 57, No. 6, pp. 748-761, June 2008.
3. T. Xie, X. Qin, "An Energy-Delay Tunable Task Allocation Strategy for Collaborative Applications in Networked Embedded Systems," IEEE Transactions on Computers, Vol. 57, No. 3, pp. 329-343, March 2008.
4. T. Xie, D.K. Madathil, "SAIL: Self-Adaptive File Reallocation on Hybrid Disk Arrays," The 15th Annual IEEE International Conference on High Performance Computing (HiPC 2008), Bangalore, India, December 17-20, 2008 (accepted, acceptance rate 14.4%, 46/319).
5. T. Xie, Y. Sun, "PEARL: Performance, Energy, and Reliability Balanced Dynamic Data Redistribution for Next Generation Disk Arrays," The 16th Annual Meeting of the IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), Baltimore, Maryland, USA, September 8-10, 2008 (acceptance rate 38.3%, 36/94).
6. T. Xie, Y. Sun, "Sacrificing Reliability for Energy Saving: Is It Worthwhile for Disk Arrays?" The 22nd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2008), Miami, Florida, USA, April 14-18, 2008 (acceptance rate 25.6%, 105/410).
7. D.K. Madathil, R.B. Thota, P. Paul, and T. Xie, "A Static Data Placement Strategy towards Perfect Load-Balancing for Distributed Storage Clusters," The 7th International Workshop on Performance Modeling, Evaluation, and Optimization of Ubiquitous Computing and Networked Systems (PMEO UCNS 2008), in conjunction with the 22nd IEEE International Parallel and Distributed Processing Symposium (IPDPS 2008), Miami, Florida, USA, April 14-18, 2008.
8. T. Xie, Y. Sun, "No More Energy-Performance Trade-Off: A New Data Placement Strategy for RAID-Structured Storage Systems," The 14th Annual IEEE International Conference on High Performance Computing (HiPC 2007), Lecture Notes in Computer Science (LNCS 3834), pp. 35-46, Goa, India, December 18-21, 2007 (acceptance rate 20.55%, 52/253).
9. T. Xie, “SOR: A Static File Assignment Strategy Immune to Workload Characteristic Assumptions in Parallel I/O Systems,” The 36th International Conference on Parallel Processing (ICPP 2007), XiAn, China, September 10-14, 2007.
Control Information System
We developed a control information system that can monitor and control industrial field production processes. The on-site operators only need to use an Internet browser such as IE to control devices and production processes. All the control information is updated automatically at regular intervals and made available to the management team. The system was deployed at Yangzi Petroleum Company, one of the largest petroleum enterprises in China.
Digital Watermarking for Image Authentication
In this work, I presented a novel image authentication scheme that embeds a fragile content-based cryptographic signature into the image compression domain. My scheme can detect the interpolation of even a single compression-domain element, so that an adversary cannot tamper with a watermarked image without being detected.
Static Analyzer for Vicious Executables
To detect obfuscated or polymorphic malware, we present a signature-based malware detection algorithm. The rationale behind our scheme is that all versions of a given piece of malware share a common signature. This signature gives us a basis for detecting future variants and mutants of the malware.
Security-Aware Scheduling for High Performance Clusters and
Improving quality of security is increasingly becoming an important issue in the design of real-time systems, which are indispensable for conducting business in government, industry, and academic organizations. This project addresses the issue of maximizing quality of security for real-time systems. We aim at developing and validating new mechanisms and schemes for security-aware real-time systems, including dynamic/static scheduling algorithms, security overhead modeling, security level controllers, and security-aware storage resources. These new schemes are expected to deliver high quality of security while meeting timing constraints of real-time systems. The novelty of the research comes not only from the security-aware schemes, but also from an improved methodology for designing and evaluating mechanisms and algorithms that integrate quality of security, scheduling, and resource management into real-time systems. Once the proposed schemes are adopted, our security-aware real-time scheduling algorithms and resource management mechanisms will be included within existing cyber security tools and services.
SAREC: Security-Aware Real-Time Scheduling for Clusters. Over the last ten years, clusters have become the fastest growing platforms in high-performance computing. Security requirements of security-critical real-time applications on clusters must be met in addition to satisfying timing constraints. However, conventional real-time scheduling algorithms ignore the applications’ security requirements. We investigated the problem of scheduling a set of independent real-time tasks running on clusters. In particular, we proposed a security overhead model that is capable of measuring security overheads incurred by security-critical tasks. Further, we proposed a security-aware scheduling strategy, or SAREC, which integrates security requirements into scheduling for real-time applications by employing our security overhead model. To evaluate the effectiveness of SAREC, we implemented a security-aware real-time scheduling algorithm (SAREC-EDF), which incorporates the earliest deadline first (EDF) scheduling algorithm into SAREC. Extensive simulation experiments show that SAREC-EDF improves overall system performance over three baseline scheduling algorithms (variations of EDF) by up to 72.55%.
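To make the idea of folding a security overhead into EDF admission concrete, the following is a minimal illustrative sketch and not the SAREC-EDF algorithm itself. It assumes a single node, a hypothetical linear overhead model (`security_overhead` with an invented `cost_per_level` parameter), and security levels normalized to [0, 1]; the `Task` fields and function names are likewise assumptions introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    exec_time: float   # execution time without any security service
    deadline: float    # absolute deadline
    min_level: float   # lowest acceptable security level in [0, 1]


def security_overhead(task, level, cost_per_level=2.0):
    # Hypothetical linear model: extra time grows with the chosen level.
    # The published overhead model is not reproduced here.
    return cost_per_level * level


def schedule_edf_with_security(tasks, levels=(1.0, 0.75, 0.5, 0.25, 0.0)):
    """Admit tasks in earliest-deadline-first order on one node, giving each
    task the highest security level that still meets its deadline."""
    clock = 0.0
    accepted = []
    for task in sorted(tasks, key=lambda t: t.deadline):
        for level in levels:  # levels are tried from highest to lowest
            if level < task.min_level:
                break  # no acceptable level remains; the task is rejected
            finish = clock + task.exec_time + security_overhead(task, level)
            if finish <= task.deadline:
                accepted.append((task.name, level, finish))
                clock = finish
                break
    return accepted


if __name__ == "__main__":
    demo = [Task("a", 2, 4, 0.5), Task("b", 3, 10, 0.25), Task("c", 1, 5, 1.0)]
    for name, level, finish in schedule_edf_with_security(demo):
        print(name, level, finish)
```

In this sketch a task whose minimum security level cannot be honored before its deadline is rejected rather than run with weaker protection, which is one simple way to treat the security requirement as a constraint alongside timing.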
Dynamic Real-Time Scheduling with Security Awareness. An increasing number of real-time applications, such as aircraft control and medical electronics systems, require high quality of security to assure confidentiality, authenticity and integrity of information. However, most existing algorithms for scheduling independent tasks in real-time systems do not adequately consider security requirements of real-time tasks. In recognition of this problem we proposed a novel dynamic scheduling algorithm with security awareness, which is capable of achieving high quality of security for real-time tasks while improving resource utilization. We have conducted extensive simulation experiments to quantitatively evaluate the performance of our approach. Specifically, experimental results show that compared with three heuristic algorithms, the proposed algorithm can consistently improve overall system performance in terms of quality of security and system guarantee ratio under a wide range of workload characteristics.
Enhancing Security of Real-Time Applications on Grids through Scheduling.
Security-sensitive real-time applications can take full advantage of a grid environment that allows grid participants to exercise fine-grained control over the allocation of computational resources. However, conventional real-time scheduling algorithms fail to fulfil the security requirements of real-time applications. In this research we proposed a dynamic real-time scheduling algorithm, or SAREG, which is able to enhance quality of security for real-time applications running on Grids. To make SAREG practical, we present a mathematical model to formally describe a scheduling framework, security-sensitive real-time applications, and security overheads. We leverage the model to measure security overheads incurred by an array of security services, including encryption, authentication, and integrity checking. To evaluate the effectiveness of SAREG, we conducted extensive simulations using a real-world trace from a supercomputing centre. Our experimental results show that SAREG significantly improves system performance in terms of quality of security and schedulability over three existing scheduling algorithms.
A New Allocation Scheme for Parallel Applications with Deadline and Security Constraints on Clusters. In this paper, we address the issue of allocating tasks of parallel applications on clusters subject to timing and security constraints in addition to precedence relationships. A task allocation scheme, or TAPADS (Task Allocation for Parallel Applications with Deadline and Security Constraints), is developed to find an optimal allocation that maximizes quality of security and the probability of meeting deadlines for parallel applications. In addition, we proposed mathematical models to describe a system framework, parallel applications with deadline and security constraints, and security overheads. Experimental results show that TAPADS significantly improves the performance of clusters in terms of quality of security and schedulability over three existing allocation schemes.
1. T. Xie, X. Qin, "Security-Aware Resource Allocation for Real-Time Parallel Jobs on Homogeneous and Heterogeneous Clusters," IEEE Transactions on Parallel and Distributed Systems, Vol. 19, No. 5, pp. 682-697, May 2008.
2. T. Xie, X. Qin, "Stochastic Scheduling for Multiclass Applications with Availability Requirements in Heterogeneous Clusters," Journal of Cluster Computing, Publisher: Springer, ISSN: 1386-7857, Volume 11, Issue 1, pp. 33-43, March 2008.
3. X. Qin, T. Xie, "An Availability-Aware Task Scheduling Strategy for Heterogeneous Systems," IEEE Transactions on Computers, Vol. 57, No. 2, pp. 188-199, February 2008.
4. T. Xie, X. Qin, "Performance Evaluation of a New Scheduling Algorithm for Distributed Systems with Security Heterogeneity," Journal of Parallel and Distributed Computing, Vol. 67, No. 10, pp. 1067-1081, October 2007.
5. T. Xie, X. Qin, "Improving Security for Periodic Tasks in Embedded Systems through Scheduling," ACM Transactions on Embedded Computing Systems, Vol. 6, Issue 3, Article No. 20, July 2007.
6. T. Xie, X. Qin, “Security-Driven Scheduling for Data-Intensive Applications on Grids,” Journal of Cluster Computing, Special Issue: Evaluation and Optimization of High-Performance Computing and Networking Systems, Guest Editors: Geyong Min and Mohamed Ould-Khaoua, Publisher: Springer, ISSN: 1386-7857, Volume 10, Number 2, pp. 145-153, June 2007.
7. M. Nijim, X. Qin, and T. Xie, "Modeling and Improving Security of a Local Disk System for Write-Intensive Workloads," ACM Transactions on Storage, Vol. 2, Issue 4, pp. 400-423, November 2006.
8. T. Xie, X. Qin, "Scheduling Security-Critical Real-Time Applications on Clusters," IEEE Transactions on Computers, vol. 55, no. 7, pp. 864-879, July 2006.
9. T. Xie, X. Qin, A. Sung, M. Lin, and L. Yang, "Real-Time Scheduling with Quality of Security Constraints," International Journal of High Performance Computing and Networking, Vol. 4, Nos. 3/4, pp. 188-197, 2006.
10. T. Xie, X. Qin, and M. Lin, “Open Issues and Challenges in Security-aware Real-Time Scheduling for Distributed Systems,” Journal on Information, Special Issue on High Performance Computational Science and Engineering, Vol. 9, No. 2, pp. 309-322, 2006.
11. T. Xie, X. Qin, “A Security Middleware Model for Real-time Applications on Grids,” IEICE Transactions on Information and Systems, Special Issue on Parallel/Distributed Computing and Networking, Vol. E89-D, No. 2, pp. 631-638, February 2006.
12. T. Xie, X. Qin, "A Security-Oriented Task Scheduler for Heterogeneous Distributed Systems," The 13th Annual IEEE International Conference on High Performance Computing (HiPC 2006), Bangalore, India, December 18-21, 2006.
13. T. Xie, X. Qin, "SHARP: A New Real-Time Scheduling Algorithm to Improve Security of Parallel Applications on Heterogeneous Clusters," The 25th IEEE International Performance Computing and Communications Conference (IPCCC 2006), April 10-12, 2006, Phoenix, Arizona, USA.
14. T. Xie, X. Qin, A. Sung, "An Approach to Satisfying Security Needs of Periodic Tasks in High Performance Embedded Systems," The 12th Annual IEEE International Conference on High Performance Computing (HiPC 2005, Poster Session), December 18-21, Goa, India.
15. T. Xie, X. Qin, "Incorporating Security into Real-Time Scheduling for Parallel Jobs on Clusters," The 26th IEEE Real-Time Systems Symposium (RTSS 2005, Work-in-Progress Session), December 5-8, 2005, Miami, Florida, USA.
16. T. Xie, X. Qin, “A New Allocation Scheme for Parallel Applications with Deadline and Security Constraints on Clusters,” The 2005 IEEE International Conference on Cluster Computing (Cluster 2005), September 27-30, Boston, Massachusetts, USA.
17. T. Xie, X. Qin, “Towards a Security Service Integration Framework for Distributed Real-Time Systems,” The 18th International Conference on Parallel and Distributed Computing Systems (PDCS 2005, ISCA), Las Vegas, NV, USA, September 12-14, 2005.
18. T. Xie, X. Qin, "Enhancing Security of Real-Time Applications on Grids through Dynamic Scheduling," Proceedings of the 11th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP'05), pp. 146-158, Cambridge, MA, June 19, 2005.
19. T. Xie, X. Qin, and A. Sung, "SAREC: A Security-Aware Scheduling Strategy for Real-Time Applications on Clusters," Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), pp. 5-12, Norway, June 14-17, 2005.
20. T. Xie, A. Sung, and X. Qin, "Dynamic Task Scheduling with Security Awareness in Real-Time Systems," Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS'05), the 4th Int'l Workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems, IEEE/ACM, April 4-8, 2005.
21. T. Xie, X. Qin, and A. Sung, "Integrating Security Requirements into Scheduling for Real-Time Applications in Grid Computing," Proceedings of the International Conference on Grid Computing and Applications, pp. 24-30, June 20-23, 2005.