CERN School of Computing 1997
Outlines of lectures and tutorials
GEANT4 is a large project which uses modern methods of programming, namely object-oriented methods and the C++ language. It is probably the largest project of its kind in progress in particle physics today. The developers are almost all particle physicists, or programmers with a particle physics background, who have had to learn these new techniques "on the job" - not only new programming methods, but also new methods of code management. This could have been (could be?) a recipe for disaster. How has it worked? What problems were encountered? How did we design the structure of the program? Does it work?
By the time of the School, GEANT4 will have issued an alpha (at-your-own-risk) release, and undergone a Review by the LHC Committee Review Board. We should have a good idea of its successes and shortcomings.
- "GEANT4: an Object-Oriented Toolkit for Simulation in HEP", CERN/LHCC/95-70, LRCB Status Report RD44, October 1995
- Bjarne Stroustrup, The C++ Programming Language (2nd Edition), Addison-Wesley, 1991
- Bjarne Stroustrup, The Design and Evolution of C++, Addison-Wesley, 1994
- Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, "Design Patterns", Addison-Wesley, 1995
- Introduction to OO databases
- comparison with relational and object-relational databases
- OODBMS principles and standards (ODMG-93)
- Persistent objects in the LHC era
- transient and persistent objects
- data rates and amounts
- Objectivity/DB architecture
- Data model
- transient and persistent data model
- example: ALICE raw data model and mathematical event generator
- BaBar data model
- Mass storage system: interface between MSS and OODBMS
Application of the STL to Reconstruction of High-energy Physics Data
An extremely important consideration in designing analysis and reconstruction code is the use of efficient data structures. The basic theme of these lectures will be application of techniques based on the Standard Template Library (STL) to problems encountered in reconstruction of data from detectors. Objective: become familiar enough with the STL to apply it to data-handling problems. Assumptions: Laboratory computers with C++ compilers supporting STL (preferably Microsoft VC++ 4.2 or later). Students have some familiarity with C++ and object-oriented design.
- STL basics: containers, iterators
- Advanced STL: adapters
For various detector types and geometries, design appropriate containers to apply some simple reconstruction algorithms. The lecturer will provide a framework allowing simple control and visualisation.
- Standard Template Library: A Definitive Approach to C++ Programming Using STL, by P. J. Plauger, Alexander A. Stepanov, Meng Lee, ISBN: 013437633. The definitive reference, by the authors.
- Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley Professional Computing), by Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, ISBN: 020163361. A supplemental reference that everyone coding OO should know about.
Human Aspects of Computing in Large Physics Collaborations
In the last several years, dramatic progress and profound changes have occurred in the use of computers. Far from being limited to "computing", i.e. "number-crunching", computers are used for an ever-broadening variety of purposes: communication, documentation, visualisation. The enormously increased power of desktop workstations has practically eliminated the whole category of "mainframes". When a sufficiently large number of computers became connected together, the initial communication and data transfer tools (e-mail, TELNET, FTP, etc.) culminated in the invention and almost explosive growth of the World-Wide Web. At the same time, software programming methodologies were significantly enriched by the concept of Object-Orientation, and available tools range from Computer-Assisted Software Engineering to various structured techniques.
Many of these changes originated in large organisations, with critical needs for communication, documentation and coherence within a large and heterogeneous group of scientists, engineers and computer programmers. It is not by accident that the World-Wide Web was invented at CERN, where there are severe demands on all aspects of data and information processing and sharing. However, in spite of the phenomenal success of the WWW, the original promise of a fundamental improvement in the effectiveness of large scientific collaborations remains largely unfulfilled. It appears that all the new hardware and software tools are just that - tools; to realise their full potential, the human tendencies and attitudes behind their use must be understood, and sometimes modified, before real progress can occur. In these lectures, we will discuss the link between human interactions in large communities on the one hand, and modern, state-of-the-art computing and communications tools on the other. Some specific aspects covered include:
- similarities and differences between the environment of large Physics Collaborations on the one hand, and large commercial organisations on the other
- human aspects of various 'paradigm shifts' which are, or will be, occurring in Physics Software Engineering
- problems specific to the choice of the programming language
- problems specific to scientific visualisation
- database issues, and data and information sharing in general
- limiting factors in the use of WWW for its original purpose.
Focusing this study on physics is timely and appropriate. There seems to be a growing realisation that, in addition to purely technological progress, there has to be an increased emphasis on the human aspects of our technological endeavours. The present-day field of High Energy Physics is an excellent candidate for a study of these issues: the geographically distributed nature of the collaborations, which is also mirrored by current trends in other "Big Science" projects and in industry, necessitates increased attention to the human factors involved.
- T. Kuhn, "The Structure of Scientific Revolutions", Univ. Chicago Press 1970
- G. Weinberg, "The Psychology of Computer Programming", Van Nostrand Reinhold, 1971
- F. Brooks, "The Mythical Man-Month", Addison-Wesley 1975
- T. DeMarco and T. Lister, "Peopleware", Dorset House 1987
- S.M. Davis, "Future Perfect", Addison-Wesley 1987
- E. Yourdon, "Decline and Fall of the American Programmer", Yourdon Press 1992
Information Systems for Physics Experiments
M. Donszelmann and B. Rousseau
- Information Systems
Definitions and generalities
- HEP Application Domains
- A tour of the various HEP domains in terms of information flow and information properties. What kind of information system is required in each domain?
- Overview of the technology, with examples of applications in HEP
- A presentation of fast-evolving Internet technology: e.g. WWW, new HTML features, Java, CORBA, databases, HTML converters, WWW authoring tools, architectures, groupware, etc. The interest of each technique or tool is explained in the context of HEP application domains
- Specific Applications
- WIRED: WWW Interactive Remote Event Display
- CEDAR: CERN EDMS for Detectors and AcceleratoR
- LIGHT: LIfe cycle Global HyperText
If you know C++ or an Object Oriented language and want to learn Java:
If you are a true beginner in the OO field:
- David Flanagan, "Java in a Nutshell", O'Reilly and Associates, 1996, ISBN: 1-56592-183-6
- Campione, Walrath, "The Java Tutorial: Object-Oriented Programming for the Internet", Corporate and Professional Publishing Group, 1996, ISBN: 0-201-63454-6
and for making Web sites:
- Polonsky, Lehto, "Introducing Microsoft FrontPage 97", Microsoft Press, 1997, ISBN: 1-5723-1571-7
Making Links in Unstructured Data: an Introduction to Hypermedia
- The early visionaries: Bush, Engelbart and Nelson
- Pioneering systems
- Second generation hypermedia systems
- Hypermedia meets the personal computer
- Present day issues:
- Interchange standards
- Hypermedia on the Internet: the Web
- Open hypermedia systems and link services
- Hypermedia and Digital Libraries
- Case Study: the Microcosm open hypermedia system
Making Links in Structured Data: an Introduction to Databases
- Relational databases
- Meta data (schemas), views
- Programming operations
- APIs, stored procedures, database clients
- Database applications
- Security, transactions, data integrity
- Making links using database schemas - the issues
- Primary and secondary (foreign) keys, integrity constraints, relational normal forms, indexing
- Active databases, federated databases, multimedia and databases
- Commercial products and their typical uses
Making Links in Web Database Applications
- Web and Java basics (URL, HTML, HTTP, CGI, NSAPI, Java)
- Making links in HTML - the issues
- Accessing databases from the Web
- Server side gateways
- Java clients and gateways including JDBC
- Integrated HTTP support
- Web database applications
- Techniques for session management
- Comparing caching: in the Web, in databases
- Comparing making links: in HTML, in databases, in database applications
Making Links in the Future
- Hyperbases and link services for the Web
- Content-based retrieval and navigation
- making links for non-text data
- Making links intelligently
- concept authoring, intelligent filters
- Putting it all together
- Distributed information management
- Interface issues
- Agents that make links
- Rethinking Hypermedia: the Microcosm Approach, Hall, Davis & Hutchings, Kluwer Press 1996, ISBN: 0-7923-9679-0
- Web Gateway Tools, Cheng & Malaika, John Wiley 1997, ISBN: 0471-17555-2
- Web Server Technology, Nancy J. Yeager and Robert E. McGrath, Morgan Kaufmann, 1996, ISBN: 1-55860-376-X
- Database: Principles, Programming, Performance, Patrick O'Neil, Morgan Kaufmann, 1994, ISBN: 1-55860-219-4
- The Java Language Specification, James Gosling, Bill Joy and Guy Steele, Addison-Wesley, 1996, ISBN: 0-201-63451-1
- Principles of Transaction Processing, Philip Bernstein and Eric Newcomer, Morgan Kaufmann, 1996, ISBN: 1-55860-415-4
Visualisation of Multidimensional and Multivariate Data
This talk will review methods for visualisation of multidimensional and multivariate data, covering techniques such as scatter plots, Chernoff faces, Andrews plots and parallel coordinates. Further examples of scalar, vector and tensor fields will be shown using fluid flow as a case study.
Systems and Architectures for Visualisation
This talk will review a number of general-purpose visualisation systems, such as AVS/Express, IRIS Explorer and IBM Data Explorer. A number of exemplars will be presented and compared. The final part of the presentation will assess the architecture of these systems for use in a distributed and collaborative working environment.
- Earnshaw & Wiseman: An Introductory Guide to Scientific Visualization, Springer-Verlag, 1992, ISBN: 3-540-54664-2
- Brodlie et al: Scientific Visualization: Techniques & Applications, Springer-Verlag, 1992, ISBN: 3-540-54565-4
- Tufte: The Visual Display of Quantitative Information, Graphics Press, Box 430, Cheshire, Connecticut 06410, USA, 1987
- Visualization 1: Graphical Communication: http://info.mcc.ac.uk/MVC/ITTI
- Nielson et al (eds): Visualization in Scientific Computing, IEEE Computer Society Press, 1990, ISBN: 0-8186-8979-X (especially the article by Haber & McNabb: Visualization Idioms: A Conceptual Model for Scientific Visualization Systems).
- Specific Systems
- It would be helpful if students had experience of one of: AVS/Express (http://www.avs.com), IRIS Explorer (http://www.nag.co.uk/Welcome_IEC.html) or IBM Data Explorer (http://www-i.almaden.ibm.com/dx)
- There are many books in the library about scientific visualization. Some specific sources are:
- Keller & Keller: Visual Clues, IEEE Computer Society, 1992, ISBN: 0-8186-3102-3
- Proceedings of various Eurographics Workshops on Visualization, published by Springer-Verlag
- Proceedings of the Eurographics Annual Conference, published by North Holland, but more recently in Computer Graphics Forum, published by Blackwells.
- Proceedings Visualization 9x conferences, published by IEEE Computer Society
- For a list of resources on WWW start at: http://info.mcc.ac.uk/MVC/MVC-othersites.html
LHC Trigger Design
Trigger design and trigger architectures will be discussed in the context of the LHC experiments. These lectures will present a "top-down" analysis of the LHC trigger requirements and design, based on the physics requirements of the LHC experiments. The LHC Level-1 trigger algorithms, based on specific trigger hardware, will be described and compared. Higher-level trigger algorithms, based on commercial switching networks and processor farms, will be presented, as well as the expected algorithm execution times. Full trigger menus and expected trigger rates will also be presented. Trigger architectures and implementations under consideration for the LHC experiments will be compared, first using very simple "paper models", then using complete modelling based on fully simulated events.
- Trigger design issues and trigger architectures
Trigger design depends on the data volumes and event topologies expected at the LHC. The front-end readout should be designed to facilitate trigger implementation. This lecture will discuss the event buffers, switching networks, processor farms, and supervisors required for different trigger strategies. Data transfer bandwidths will be discussed, including the possible use of regions-of-interest and pre-processing. Interfaces from the data buffers to the switches and from the switches to the processor farms will also be discussed.
- Physics requirements for LHC triggers
The first step in determining a trigger strategy is to review the physics requirements of the system. This lecture is not meant as a "physics" lecture. The objective is to review the expected physics channels to determine which trigger algorithms are needed at Level 1 and at the higher trigger levels. The catalogue of physics processes would include Higgs decays, SUSY particles, gauge bosons, heavy vector bosons, top quarks, and B physics. Inclusive triggers would also be considered in order to satisfy the requirements of unexpected new physics.
- Trigger algorithms and rates
This lecture will describe the trigger algorithms foreseen for the LHC experiments. Level-1 trigger rates (muon, electron/gamma, hadron, jet, and missing-Et) will be presented, as a function of threshold, at low luminosity (10^33) and at high luminosity (10^34). Higher-level trigger algorithms will be described, together with the data required for each algorithm, the trigger rate expected, and an estimate of the algorithm execution time. The LHC trigger strategies will be compared to the strategies followed by the collider experiments at the Fermilab Tevatron.
- Trigger menus
Full trigger menus, based on the physics requirements, will be presented for luminosities of 10^33, 3x10^33, and 10^34 /cm2/s. These "sample" trigger menus are meant as "existence proofs"; the final allocation of trigger bandwidth will be made just before data taking begins. Estimated rates will be given for each of the trigger items. Options for the boundary between level-2 and level-3 processing will be discussed, especially as concerns B physics, B-jet tags, and missing-Et. There will be a short discussion of the possible use of neural networks in the LHC triggers.
- Trigger modelling
Trigger modelling can be used to determine the influence of different trigger strategies on physics performance and on cost. This lecture will describe a "paper model" technique which uses full trigger menus, but takes (estimated) average values for parameters such as data transfer volumes and rates, algorithm execution times, and processing overheads. The "paper model" results can be used to guide the full modelling studies and switching-network emulation (using MACRAME). Sequential and parallel processing schemes will be compared, as well as single-farm and multiple-farm architectures.
- ATLAS Technical Proposal, CERN/LHCC/94-43
- Chapter 1: Introduction and Overview
- Chapter 5: Trigger, DAQ, and Computing
- Chapter 11: Physics.
- J. Bystricky, et al., "ATLAS Trigger Menus at Luminosity 10^33 /cm2/s". ATLAS Internal Note DAQ-NO-54.
- J. Bystricky, et al., "A Model for Sequential Processing in the ATLAS LVL2/LVL3 Trigger". ATLAS Internal Note DAQ-NO-55.
Network-based Remote Instrument and Experiment Control
W. E. Johnston
This set of lectures will cover some of the basic computing technology and current issues for using the Internet for collaborative remote instrument and experiment control. A set of related case studies reflecting experience in this area will be part of the presentation.
- Software and hardware architectures for experiment control
- network latency hiding as an architectural issue
- implementing adaptive, dynamic experiment protocols
- remote control and access arbitration
- case study: dynamic in-situ microscopy experiments
- Internet-based multimedia conferencing
- audio and video issues
- rate adaptive distribution
- format and protocol gateways
- the Mbone tools and their evolution
- a data source?
- Managing high-speed distributed data flows (40 min)
- network-based storage as an approach to scalability
- reliable multicast to collaborators (a la Mbone?)
- case study: on-line collection, processing, and cataloguing of multi-megabit data streams
- On-line experiment notebooks
- a tool for organising everything?
- an object repository?
- cross platform operation
- integration with the Web.
- The importance of detailed throughput monitoring
- precision timestamping and monitoring
- monitoring and performance data analysis
- institutional issues
- deployment issues
- capabilities (access control, integrity, confidentiality, resource brokering)
- a case study: confidentiality and access control for distributed (medical) data.
Software Process and Quality (Organisational aspects)
Software Process and Quality
The production of software is a labour-intensive activity. This is certainly the case in the field of Particle Physics, given the scale of the software projects related to current and future experiments.
For most of the scientists and engineers involved in software production, the business is science or engineering, not computing. As the scope of software continues to grow, so does the feeling that its development and maintenance are out of control. The situation is made even worse by a lack of software engineers and by an uneven software culture.
To be able to control the production of software it is essential to improve (a) the knowledge of the PEOPLE involved, (b) the organisation and improvement of the software development PROCESS (SPI) and (c) the TECHNOLOGY used in the various aspects of this activity. The goal is better systems at lower cost, and happier users of the software.
After putting the three aspects of software production (people, process, technology) into perspective, we will look in depth at one particular aspect of the technology: software metrics as a component of software quality improvement, using a tool called Logiscope. This close-up of the tool will also include live demonstrations.
Software Metrics Laboratory Exercises:
In the hands-on sessions, students will practise with Logiscope, a software metrics tool, on a Sun Solaris computer. Logiscope is a toolbox for improving programming quality and test coverage. It can analyse more than 80 language variants, including C, C++ and Fortran.
Logiscope features include:
The instructors will provide some sample code in C++ as the starting point for the exercises. For the remaining part of the session, students should bring examples of code (their own, or code they use) for analysis (further instructions will follow).
- Code quality with support for software metrics computation to assess maintainability, testability and components reusability;
- Test coverage with support for coverage rates on source code branches, procedure calls, instruction blocks etc.;
- Code standards with support for verification of the program against programming rules and customisation of rules to check;
- Graphical reverse engineering.
- ESA Software Engineering Standards, Prentice-Hall, ISBN: 0-13-106568-8
- ESA Software Engineering Guides, Prentice-Hall, ISBN: 0-13-449281-1
- Managing the Software Process, W. S. Humphrey, Addison-Wesley, 1990, ISBN: 0-201-18095-2
- A Discipline for Software Engineering W. S. Humphrey, Addison-Wesley, 1995 ISBN: 0-201-54610-8
- Introduction to the Personal Software Process W. S. Humphrey, Addison-Wesley, 1997 ISBN: 0-201-54809-7
- Quality, Productivity and Competitive Position, W. E. Deming, MIT, 1982
- Capability Maturity Model for Software, V1.1 (CMU/SEI-93-TR-24), Software Engineering Institute, 1993
- Key Practices of the Capability Maturity Model, V1.1 (CMU/SEI-93-TR-25), Software Engineering Institute, 1993
- Benefits of CMM-Based Software Process Improvement: Initial Results (CMU/SEI-94-TR-13), J. Herbsleb, A. Carleton, J. Rozum, J. Siegel, D. Zubrow, Software Engineering Institute, 1994
- Software Improvements at an International Company, H. Wohlwend and S. Rosenbaum, Schlumberger Laboratory for Computer Science, Austin, Texas, USA
The LHC Computing Model
Modern Object-Oriented Software Development
- Requirements and systems analysis
- Design and implementation phases
- Maintenance and the product life-cycle
- Analysis via Use-Cases
- Actors and Use-Cases
- Use-Cases for correct and robust interactions
- Interaction diagrams.
- Object Modelling
- Identifying classes, attributes and methods
- Categorising classes
- Associations, contracts and interfaces
- Producing the object model
- Behavioural modelling
- Refining interaction diagrams
- Converting interaction diagrams into per-object state automata
- Designing methods.
- Design issues
- Aspects of design: concurrency, user interfaces, persistence
- Object-level design: patterns, standard methods, integrity
- Designing for robustness: debugging and tracing.
- Reuse issues
- Abstracting reusable classes from designs.
Simulation is a widespread method for understanding and designing complex systems. It is applied where the complexity of a system inhibits a closed-form description, or where the cost of experiments or prototypes inhibits measurements. Simulations are based on an abstract model of a real system, described in terms of objects and their behaviour. In discrete-event simulations, an object's behaviour is expressed in terms of state changes which can occur only at discrete events in time. This method is very well suited to computers, and a wide variety of programming languages is available for the purpose. As an example of such a language, MODSIM II will be described in some detail. The design of data acquisition systems for future experiments in high energy physics will be given as an example of an application of discrete-event simulations.
The lectures will give answers to the following questions:
- 1st lecture:
- Why simulate?
- What is simulation?
- What is "discrete-event" simulation?
- 2nd lecture:
- How do discrete-event simulations work?
- What tools and languages are available?
- What is MODSIM II?
- 3rd lecture:
- What is a data acquisition system and why does it need simulation?
- How can MODSIM II be used for simulations of data acquisition systems?
- What results can be obtained?
- What has been learned about discrete-event simulations?
- G. S. Fishman, Concepts and Methods in Discrete-event Digital Simulation, Wiley, New York, 1973 - a classic; should be in every university library.
- J. Banks et al., Discrete-event System Simulation, 2nd ed., Prentice Hall, New Jersey, 1996, ISBN: 0-13-217449-9 - a modern textbook on discrete-event simulation.
- http://www.cpsc.ucalgary.ca/~gomes/HTML/sim.html - a list of links, bookmarks, on-line journals etc.
Visualisation in High Energy Physics
Visualisation plays a crucial role in enabling physicists to understand complex multi-dimensional data. With the advent of powerful computing hardware, sophisticated scientific visualisation software has become a standard part of the analysis toolkit of High Energy Physics experiments.
We describe the concepts underlying successful HEP visualisation systems, for both statistical analyses of multi-event data sets and for the display of individual events in the detectors, using examples from current experiments. Future HEP experiments, such as those at the CERN Large Hadron Collider, are considerably more complex than those currently running. We discuss how new computing technologies will facilitate the difficult visualisation tasks of these experiments.
M.Ruggier, J.Turner - Last update: 3 JUN 1997