CERN School of Computing 1997
Outlines of lectures and tutorials
GEANT4 is a large project which uses modern methods of programming, namely object-oriented methods and the C++ language. It is probably the largest project of its kind in progress in particle physics today. The developers are almost all particle physicists, or programmers with a particle physics background, who have had to learn these new techniques "on the job" - not only new programming methods, but also new methods of code management. This could have been (could be?) a recipe for disaster. How has it worked? What problems were encountered? How did we design the structure of the program? Does it work?
By the time of the School, GEANT4 will have issued an alpha (at-your-own-risk) release, and undergone a Review by the LHC Committee Review Board. We should have a good idea of its successes and shortcomings.
- "GEANT4: an Object-Oriented Toolkit for Simulation in HEP", CERN/LHCC/95-70, LRCB Status Report RD44, October 1995
- Bjarne Stroustrup, The C++ Programming Language (2nd Edition), Addison-Wesley, 1991
- Bjarne Stroustrup, The Design and Evolution of C++, Addison-Wesley, 1994
- Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, "Design Patterns", Addison-Wesley, 1995
- Introduction to OO databases
- comparison with relational and object-relational databases
- OODBMS principles and standards (ODMG-93)
- Persistent objects in the LHC era
- transient and persistent objects
- data rates and amounts
- Objectivity/DB architecture
- Data model
- transient and persistent data model
- example: ALICE raw data model and mathematical event generator
- BaBar data model
- Mass storage system: interface between MSS and OODBMS
Application of the STL to Reconstruction of High-energy Physics Data
An extremely important consideration in designing analysis and reconstruction code is the use of efficient data structures. The basic theme of these lectures will be application of techniques based on the Standard Template Library (STL) to problems encountered in reconstruction of data from detectors. Objective: become familiar enough with the STL to apply it to data-handling problems. Assumptions: Laboratory computers with C++ compilers supporting STL (preferably Microsoft VC++ 4.2 or later). Students have some familiarity with C++ and object-oriented design.
- STL basics: containers, iterators
- Advanced STL: adapters
For various detector types and geometries, design appropriate containers to apply some simple reconstruction algorithms. The lecturer will provide a framework allowing simple control and visualisation.
- Standard Template Library: A Definitive Approach to C++ Programming Using STL, by P. J. Plauger, Alexander A. Stepanov, Meng Lee, ISBN: 013437633. The definitive reference, by the authors.
- Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley Professional Computing), by Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, ISBN: 020163361. A supplemental reference that everyone coding OO should know about.
Human Aspects of Computing in Large Physics Collaborations
In the last several years, dramatic progress and profound changes have occurred in the use of computers. Far from being limited to "computing", i.e. "number-crunching", computers are used for an ever-broadening variety of purposes: communication, documentation, visualisation. The enormously increased power of desktop workstations has practically eliminated the whole category of "mainframes". When a sufficiently large number of computers became connected together, the initial communication and data transfer tools (e-mail, TELNET, FTP, etc.) culminated in the invention and almost explosive growth of the World-Wide Web. At the same time, software programming methodologies were significantly enriched by the concept of Object-Orientation, and available tools range from Computer-Assisted Software Engineering to various structured techniques.
Many of these changes originated in large organisations, with critical needs for communication, documentation and coherence within a large and heterogeneous group of scientists, engineers and computer programmers. It is not by accident that the World-Wide Web was invented at CERN, where there are severe demands on all aspects of data and information processing and sharing. However, in spite of the phenomenal success of the WWW, the original promise of a fundamental improvement in the effectiveness of large scientific collaborations remains largely unfulfilled. It appears that all the new hardware and software tools are just that - tools; to realise their full potential, the human tendencies and attitudes behind their use must be understood, and sometimes modified, before real progress can occur. In these lectures, we will discuss the link between human interactions in large communities on the one hand, and modern, state-of-the-art computing and communications tools on the other. Some specific aspects covered include:
- similarities and differences between the environment of large Physics Collaborations on the one hand, and large commercial organisations on the other
- human aspects of various 'paradigm shifts' which are, or will be, occurring in Physics Software Engineering
- problems specific to the choice of the programming language
- problems specific to scientific visualisation
- database issues, and data and information sharing in general
- limiting factors in the use of WWW for its original purpose.
Focusing this study on physics is timely and appropriate. There seems to be a growing realisation that, in addition to purely technological progress, there has to be an increased emphasis on the human aspects of our technological endeavours. The present-day field of High Energy Physics is an excellent candidate for a study of these issues: the geographically distributed nature of the collaborations, which is also mirrored by current trends in other "Big Science" projects and in industry, necessitates increased attention to the human factors involved.
- T. Kuhn, "The Structure of Scientific Revolutions", Univ. Chicago Press 1970
- G. Weinberg, "The Psychology of Computer Programming", Van Nostrand Reinhold, 1971
- F. Brooks, "The Mythical Man-Month", Addison-Wesley 1975
- T. DeMarco and T. Lister, "Peopleware", Dorset House 1987
- S.M. Davis, "Future Perfect", Addison-Wesley 1987
- E. Yourdon, "Decline and Fall of the American Programmer", Yourdon Press 1992
Information Systems for Physics Experiments
M. Donszelmann and B. Rousseau
- Information Systems
Definitions and generalities
- HEP Application Domains
- A tour of the various HEP domains in terms of information flow and information properties. What kind of information system is required in each domain?
- Overview of the technology, with examples of applications in HEP
- A presentation of fast-evolving Internet technology: e.g. WWW, new HTML features, Java, CORBA, databases, HTML converters, WWW authoring tools, architectures, groupware, etc. The interest of each technique or tool is explained in the context of HEP application domains
- Specific Applications
- WIRED: WWW Interactive Remote Event Display
- CEDAR: CERN EDMS for Detectors and AcceleratoR
- LIGHT: LIfe cycle Global HyperText
If you know C++ or an Object Oriented language and want to learn Java:
If you are a true beginner in the OO field:
- David Flanagan, "Java in a Nutshell", O'Reilly and Associates, 1996, ISBN: 1-56592-183-6
- Campione, Walrath, "The Java Tutorial: Object-Oriented Programming for the Internet", Corporate and Professional Publishing Group, 1996, ISBN: 0-201-63454-6
and for making Web sites:
- Polonsky, Lehto, "Introducing Microsoft FrontPage 97", Microsoft Press, 1997, ISBN: 1-5723-1571-7
Making Links in Unstructured Data: an Introduction to Hypermedia
- The early visionaries: Bush, Engelbart and Nelson
- Pioneering systems
- Second generation hypermedia systems
- Hypermedia meets the personal computer
- Present day issues:
- Interchange standards
- Hypermedia on the Internet: the Web
- Open hypermedia systems and link services
- Hypermedia and Digital Libraries
- Case Study: the Microcosm open hypermedia system
Making Links in Structured Data: an Introduction to Databases
- Relational databases
- Meta data (schemas), views
- Programming operations
- APIs, stored procedures, database clients
- Database applications
- Security, transactions, data integrity
- Making links using database schemas - the issues
- Primary and secondary (foreign) keys, integrity constraints, relational normal forms, indexing
- Active databases, federated databases, multimedia and databases
- Commercial products and their typical uses
Making Links in Web Database Applications
- Web and Java basics (URL, HTML, HTTP, CGI, NSAPI, Java)
- Making links in HTML - the issues
- Accessing databases from the Web
- Server side gateways
- Java clients and gateways including JDBC
- Integrated HTTP support
- Web database applications
- Techniques for session management
- Comparing caching: in the Web, in databases
- Comparing making links: in HTML, in databases, in database applications
Making Links in the Future
- Hyperbases and link services for the Web
- Content-based retrieval and navigation
- making links for non-text data
- Making links intelligently
- concept authoring, intelligent filters
- Putting it all together
- Distributed information management
- Interface issues
- Agents that make links
- Rethinking Hypermedia: the Microcosm Approach, Hall, Davis & Hutchings, Kluwer Press 1996, ISBN: 0-7923-9679-0
- Web Gateway Tools, Cheng & Malaika, John Wiley 1997, ISBN: 0471-17555-2
- Web Server Technology, Nancy J. Yeager and Robert E. McGrath, Morgan Kaufmann, 1996, ISBN: 1-55860-376-X
- Database: Principles, Programming, Performance, Patrick O'Neil, Morgan Kaufmann, 1994, ISBN: 1-55860-219-4
- The Java Language Specification, James Gosling, Bill Joy and Guy Steele, Addison-Wesley, 1996, ISBN: 0-201-63451-1
- Principles of Transaction Processing, Philip Bernstein and Eric Newcomer, Morgan Kaufmann, 1996, ISBN: 1-55860-415-4
Visualisation of Multidimensional and Multivariate Data
This talk will review methods for visualisation of multidimensional and multivariate data, covering techniques such as scatter plots, Chernoff faces, Andrews plots and parallel coordinates. Further examples of scalar, vector and tensor fields will be shown using fluid flow as a case study.
Systems and Architectures for Visualisation
This talk will review a number of general-purpose visualisation systems, such as AVS/Express, IRIS Explorer and IBM Data Explorer. A number of exemplars will be presented and compared. The final part of the presentation will assess the architecture of these systems for use in a distributed and collaborative working environment.
- Earnshaw & Wiseman: An Introductory Guide to Scientific Visualization, Springer-Verlag, 1992, ISBN: 3-540-54664-2
- Brodlie et al: Scientific Visualization: Techniques & Applications, Springer-Verlag, 1992, ISBN: 3-540-54565-4
- Tufte: The Visual Display of Quantitative Information, Graphics Press, Box 430, Cheshire, Connecticut 06410, USA, 1987
- Visualization 1: Graphical Communication: http://info.mcc.ac.uk/MVC/ITTI
- Nielson et al (eds): Visualization in Scientific Computing, IEEE Computer Society Press, 1990, ISBN: 0-8186-8979-X (especially the article by Haber & McNabb: Visualization Idioms: A Conceptual Model for Scientific Visualization Systems).
- Specific Systems
- It would be helpful if students had experience of one of: AVS/Express (http://www.avs.com), IRIS Explorer (http://www.nag.co.uk/Welcome_IEC.html) or IBM Data Explorer (http://www-i.almaden.ibm.com/dx)
- There are many books in the library about scientific visualization. Some specific sources are:
- Keller & Keller: Visual Clues, IEEE Computer Society, 1992, ISBN: 0-8186-3102-3
- Proceedings of various Eurographics Workshops on Visualization, published by Springer-Verlag
- Proceedings of the Eurographics Annual Conference, published by North Holland, but more recently in Computer Graphics Forum, published by Blackwells.
- Proceedings Visualization 9x conferences, published by IEEE Computer Society
- For a list of resources on WWW start at: http://info.mcc.ac.uk/MVC/MVC-othersites.html
LHC Trigger Design
Trigger design and trigger architectures will be discussed in the context of the LHC experiments. These lectures will present a "top-down" analysis of the LHC trigger requirements and design, based on the physics requirements of the LHC experiments. The LHC Level-1 trigger algorithms, based on specific trigger hardware, will be described and compared. Higher-level trigger algorithms, based on commercial switching networks and processor farms, will be presented, as well as the expected algorithm execution times. Full trigger menus and expected trigger rates will also be presented. Trigger architectures and implementations under consideration for the LHC experiments will be compared, first using very simple "paper models", then using complete modelling based on fully simulated events.
- Trigger design issues and trigger architectures
Trigger design depends on the data volumes and event topologies expected at the LHC. The front-end readout should be designed to facilitate trigger implementation. This lecture will discuss the event buffers, switching networks, processor farms, and supervisors required for different trigger strategies. Data transfer bandwidths will be discussed, including the possible use of regions-of-interest and pre-processing. Interfaces from the data buffers to the switches and from the switches to the processor farms will also be discussed.
- Physics requirements for LHC triggers
The first step in determining a trigger strategy is to review the physics requirements of the system. This lecture is not meant as a "physics" lecture. The objective is to review the expected physics channels to determine which trigger algorithms are needed at Level 1 and at the higher trigger levels. The catalogue of physics processes would include Higgs decays, SUSY particles, gauge bosons, heavy vector bosons, top quarks, and B physics. Inclusive triggers would also be considered in order to satisfy the requirements of unexpected new physics.
- Trigger algorithms and rates
This lecture will describe the trigger algorithms foreseen for the LHC experiments. Level-1 trigger rates (muon, electron/gamma, hadron, jet, and missing-Et) will be presented, as a function of threshold, at low luminosity (10^33) and at high luminosity (10^34). Higher-level trigger algorithms will be described, together with the data required for each algorithm, the trigger rate expected, and an estimate of the algorithm execution time. The LHC trigger strategies will be compared to the strategies followed by the collider experiments at the Fermilab Tevatron.
- Trigger menus
Full trigger menus, based on the physics requirements, will be presented for luminosities of 10^33, 3x10^33, and 10^34 /cm2/s. These "sample" trigger menus are meant as "existence proofs"; the final allocation of trigger bandwidth will be made just before data taking begins. Estimated rates will be given for each of the trigger items. Options for the boundary between level-2 and level-3 processing will be discussed, especially as concerns B physics, B-jet tags, and missing-Et. There will be a short discussion of the possible use of neural networks in the LHC triggers.
- Trigger modelling
Trigger modelling can be used to determine the influence of different trigger strategies on physics performance and on cost. This lecture will describe a "paper model" technique which uses full trigger menus, but takes (estimated) average values for parameters such as data transfer volumes and rates, algorithm execution times, and processing overheads. The "paper model" results can be used to guide the full modelling studies and switching-network emulation (using MACRAME). Sequential and parallel processing schemes will be compared, as well as single-farm and multiple-farm architectures.
- ATLAS Technical Proposal, CERN/LHCC/94-43
- Chapter 1: Introduction and Overview
- Chapter 5: Trigger, DAQ, and Computing
- Chapter 11: Physics.
- J. Bystricky, et al., "ATLAS Trigger Menus at Luminosity 10^33 /cm2/s". ATLAS Internal Note DAQ-NO-54.
- J. Bystricky, et al., "A Model for Sequential Processing in the ATLAS LVL2/LVL3 Trigger". ATLAS Internal Note DAQ-NO-55.
Network-based Remote Instrument and Experiment Control
W. E. Johnston
This set of lectures will cover some of the basic computing technology and current issues for using the Internet for collaborative remote instrument and experiment control. A set of related case studies reflecting experience in this area will be part of the presentation.
- Software and hardware architectures for experiment control
- network latency hiding as an architectural issue
- implementing adaptive, dynamic experiment protocols
- remote control and access arbitration
- case study: dynamic in-situ microscopy experiments
- Internet-based multimedia conferencing
- audio and video issues
- rate adaptive distribution
- format and protocol gateways
- the Mbone tools and their evolution
- a data source?
- Managing high-speed distributed data flows (40 min)
- network-based storage as an approach to scalability
- reliable multicast to collaborators (a la Mbone?)
- case study: on-line collection, processing, and cataloguing of multi-megabit data streams
- On-line experiment notebooks
- a tool for organising everything?
- an object repository?
- cross platform operation
- integration with the Web.
- The importance of detailed throughput monitoring
- precision timestamping and monitoring
- monitoring and performance data analysis
- institutional issues
- deployment issues
- capabilities (access control, integrity, confidentiality, resource brokering)
- a case study: confidentiality and access control for distributed (medical) data.
Software Process and Quality (Organisational aspects)
Software Process and Quality
The production of software is a labour-intensive activity. This is certainly the case in the field of Particle Physics, given the scale of the software projects related to current and future experiments.
For most of the scientists and engineers involved in software production, the business is science or engineering, not computing. As the scope of software continues to grow, so does the feeling that its development and maintenance are out of control. The situation is made even worse by a lack of software engineers and by an uneven software culture.
To be able to control the production of software it is essential to improve (a) the knowledge of the PEOPLE involved, (b) the organisation and improvement of the software development PROCESS (SPI) and (c) the TECHNOLOGY used in the various aspects of this activity. The goal is better systems at lower cost, and happier users of the software.
After putting the three aspects of software production (people, process, technology) into perspective, we will look in depth at one particular aspect of the technology: software metrics as a component of software quality improvement, using a tool called Logiscope. This close-up of the tool will also include live demonstrations.
Software Metrics Laboratory Exercises:
In the hands-on sessions, students will practise with Logiscope, a software metrics tool, on a Sun Solaris computer. Logiscope is a toolbox for improving programming quality and test coverage. It can analyse more than 80 language variants, including C, C++ and Fortran.
Logiscope features include:
The instructors will provide some sample code in C++ as the starting point for the exercises. For the remaining part of the session, students should bring examples of code (their own, or code they use) for analysis (further instructions will follow).
- Code quality with support for software metrics computation to assess maintainability, testability and components reusability;
- Test coverage with support for coverage rates on source code branches, procedure calls, instruction blocks etc.;
- Code standards with support for verification of the program against programming rules and customisation of rules to check;
- Graphical reverse engineering.
- ESA Software Engineering Standards, Prentice-Hall, ISBN: 0-13-106568-8
- ESA Software Engineering Guides, Prentice-Hall, ISBN: 0-13-449281-1
- Managing the Software Process, W. S. Humphrey, Addison-Wesley, 1990, ISBN: 0-201-18095-2
- A Discipline for Software Engineering W. S. Humphrey, Addison-Wesley, 1995 ISBN: 0-201-54610-8
- Introduction to the Personal Software Process W. S. Humphrey, Addison-Wesley, 1997 ISBN: 0-201-54809-7
- Quality, Productivity and Competitive Position, W. E. Deming, MIT, 1982
- Capability Maturity Model for Software, V1.1 (CMU/SEI-93-TR-24), Software Engineering Institute, 1993
- Key Practices of the Capability Maturity Model, V1.1 (CMU/SEI-93-TR-25), Software Engineering Institute, 1993
- Benefits of CMM-Based Software Process Improvement: Initial Results (CMU/SEI-94-TR-13), J. Herbsleb, A. Carleton, J. Rozum, J. Siegel, D. Zubrow, Software Engineering Institute, 1994
- Software Improvements at an International Company, H. Wohlwend and S. Rosenbaum, Schlumberger Laboratory for Computer Science, Austin, Texas, USA
The LHC Computing Model
Modern Object-Oriented Software Development
- Requirements and systems analysis
- Design and implementation phases
- Maintenance and the product life-cycle
- Analysis via Use-Cases
- Actors and Use-Cases
- Use-Cases for correct and robust interactions
- Interaction diagrams.
- Object Modelling
- Identifying classes, attributes and methods
- Categorising classes
- Associations, contracts and interfaces
- Producing the object model
- Behavioural modelling
- Refining interaction diagrams
- Converting interaction diagrams into per-object state automata
- Designing methods.
- Design issues
- Aspects of design: concurrency, user interfaces, persistence
- Object-level design: patterns, standard methods, integrity
- Designing for robustness: debugging and tracing.
- Reuse issues
- Abstracting reusable classes from designs.
Simulation is a widespread method for understanding and designing complex systems. It is applied where the complexity of a system inhibits a closed-form description, or where the cost of experiments or prototypes inhibits measurements. Simulations are based on an abstract model of a real system, described in terms of objects and their behaviour. In discrete-event simulations, an object's behaviour is expressed in terms of state changes which can occur only at discrete events in time. This method is very well suited to computers, and a wide variety of programming languages is available for the purpose. As an example of such a language, MODSIM II will be described in some detail. The design of data acquisition systems for future experiments in high energy physics will be given as an example of an application of discrete-event simulations.
The lectures will give answers to the following questions:
- 1st lecture:
- Why simulate?
- What is simulation?
- What is "discrete-event" simulation?
- 2nd lecture:
- How do discrete-event simulations work?
- What tools and languages are available?
- What is MODSIM II?
- 3rd lecture:
- What is a data acquisition system and why does it need simulation?
- How can MODSIM II be used for simulations of data acquisition systems?
- What results can be obtained?
- What has been learned about discrete-event simulations?
- G. S. Fishman, Concepts and Methods in Discrete-event Digital Simulation, Wiley, New York, 1973 - a classic; should be in every university library.
- J. Banks et al., Discrete-event System Simulation, 2nd ed., Prentice Hall, New Jersey, 1996, ISBN: 0-13-217449-9 - a modern textbook on discrete-event simulation.
- http://www.cpsc.ucalgary.ca/~gomes/HTML/sim.html - a list of links, bookmarks, on-line journals etc.
Visualisation in High Energy Physics
Visualisation plays a crucial role in enabling physicists to understand complex multi-dimensional data. With the advent of powerful computing hardware, sophisticated scientific visualisation software has become a standard part of the analysis toolkit of High Energy Physics experiments.
We describe the concepts underlying successful HEP visualisation systems, for both statistical analyses of multi-event data sets and for the display of individual events in the detectors, using examples from current experiments. Future HEP experiments, such as those at the CERN Large Hadron Collider, are considerably more complex than those currently running. We discuss how new computing technologies will facilitate the difficult visualisation tasks of these experiments.
M.Ruggier, J.Turner - Last update: 3 JUN 1997