Panel Session: Sakai Performance Profiling, Testing & Tuning
Session 076
Linda Place, Stuart Sim, Hans Massing
Wednesday
11:00 am-12:00 pm
Room: Rm402
Session Abstract
A discussion of Sakai performance profiling, testing and tuning. Topics: creating a performance test environment; why a performance profile should be created; how to create a performance profile; how to construct tests; load tools; experience using LoadRunner at UM; how to tune Sakai for performance.
Audience notes (add your own...)
Useful information and comments that weren't in the PowerPoint (though a fuller version of the slides may be uploaded here):
Sun (Stuart Sim): Sun figures out how to make things work on a large scale. It has been profiling Sakai, starting with 2.1, and is working towards "Sakai in a box", an appliance that you can slide into a rack. Sun is also looking at high-availability solutions (e.g. five nines, 99.999% uptime).
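To put the five-nines figure in perspective, here is a minimal sketch of the arithmetic behind an availability target. The function name and the choice of a 365.25-day year are assumptions made for illustration; nothing here comes from the session itself.

```python
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare three, four and five nines.
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% availability allows {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, five nines leaves a little over five minutes of total downtime per year, which is why it usually implies redundant hardware and automated failover rather than manual recovery.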
Michigan: Load testing experience. The objective was to spread risk, so lots of servers were deployed initially. Load-tested with various combinations of app servers and stopped at 12; 8 was optimal (some operations were faster with 8 than with 12). Figures: 37,500 users, 2,000-3,000 concurrent on average, 4,700 peak observed. App servers are Dell 2650s and 2850s: dual-processor, 3.0 GHz CPU, 4 GB RAM. The database is a 4-CPU SunFire with 20 GB RAM; load average is about 2, with peaks to 12.
Used Mercury LoadRunner (because of an existing site license). Tests were tuned to simulate a certain number of hits/hour for each operation. "Hits/hour" in this context means user activity, i.e. work actions such as assignment submissions and file downloads, not URL hits; it is important to agree on terminology.
Observed activity was converted into testing targets, with courses preloaded and prepopulated. The Sites tool in Gateway is still problematic, so it has been removed until the performance issues can be resolved.
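The conversion from observed activity to testing targets can be sketched roughly as follows. This is not the Michigan team's method or a LoadRunner script, only an illustration of the arithmetic: the action names, the observed rates, the scaling factor and the number of virtual users are all made-up values, and `scale_targets` and `pacing_seconds` are hypothetical helpers.

```python
from typing import Dict


def scale_targets(observed_per_hour: Dict[str, float], factor: float) -> Dict[str, float]:
    """Turn observed user actions per hour into test targets by scaling them."""
    return {action: rate * factor for action, rate in observed_per_hour.items()}


def pacing_seconds(target_actions_per_hour: float, virtual_users: int) -> float:
    """Seconds between successive actions for each virtual user, so that the
    whole pool of virtual users together reaches the hourly target."""
    if target_actions_per_hour <= 0 or virtual_users <= 0:
        raise ValueError("target rate and virtual user count must be positive")
    actions_per_user_per_hour = target_actions_per_hour / virtual_users
    return 3600.0 / actions_per_user_per_hour


if __name__ == "__main__":
    # Hypothetical observed rates, counted as work actions rather than URL hits.
    observed = {"assignment_submission": 1200.0, "file_download": 6000.0}
    targets = scale_targets(observed, factor=2.0)  # test at twice the observed load
    for action, rate in targets.items():
        wait = pacing_seconds(rate, virtual_users=100)
        print(f"{action}: {rate:.0f} actions/hour, one action per user every {wait:.0f} s")
```

Expressing the target per work action keeps the test tied to the agreed meaning of "hits/hour", independent of how many URL requests each action generates.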
Hans: The purpose of testing is to exaggerate to the extreme and expose trends not normally seen. UMich DB failover is designed so that a failure costs no more than 10 minutes of service or 10 MB of data; logs are copied to a standby environment. It's important to set up a regression plan to restore your testing environment to its initial state.
Glenn: Sun could help with JVM memory management issues, which are still something of a black art. App servers are reset every 2-3 days. Stuart noted that Sun is working directly with some Sakai sites on these issues, and that moving to Java 1.5 will help. Logs with GC details are useful for analysis.
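As an illustration of what GC logs offer for analysis, here is a minimal sketch that summarizes collection pauses. It assumes the simple line format printed by the Sun JVM's `-verbose:gc` option, as in the sample lines below; other logging options produce different formats and would need a different pattern. The sample data and all names in the code are illustrative, not taken from any Sakai deployment.

```python
import re
from typing import Dict, Iterable, List, NamedTuple

# Assumed format: "[GC 325407K->83000K(776768K), 0.2300771 secs]"
GC_LINE = re.compile(r"\[(Full GC|GC) (\d+)K->(\d+)K\((\d+)K\), (\d+\.\d+) secs\]")


class GcEvent(NamedTuple):
    full: bool         # True for a full collection
    before_kb: int     # heap in use before the collection
    after_kb: int      # heap in use after the collection
    heap_kb: int       # total heap size
    pause_secs: float  # time spent in the collection


def parse_gc_log(lines: Iterable[str]) -> List[GcEvent]:
    """Extract GC events from log lines, skipping lines that do not match."""
    events = []
    for line in lines:
        match = GC_LINE.search(line)
        if match:
            kind, before, after, heap, secs = match.groups()
            events.append(GcEvent(kind == "Full GC", int(before), int(after),
                                  int(heap), float(secs)))
    return events


def summarize(events: List[GcEvent]) -> Dict[str, object]:
    """Count collections, total the pause time, and list heap use after full GCs."""
    full = [e for e in events if e.full]
    return {
        "collections": len(events),
        "full_collections": len(full),
        "total_pause_secs": sum(e.pause_secs for e in events),
        "max_pause_secs": max((e.pause_secs for e in events), default=0.0),
        "after_full_gc_kb": [e.after_kb for e in full],
    }


if __name__ == "__main__":
    # Illustrative sample lines in the assumed format.
    sample = [
        "[GC 325407K->83000K(776768K), 0.2300771 secs]",
        "[Full GC 267628K->83769K(776768K), 1.8479984 secs]",
    ]
    print(summarize(parse_gc_log(sample)))
```

If heap use after each full collection keeps climbing over hours or days, memory is probably being retained somewhere, which may be relevant to servers that need a reset every few days.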