"Running CF in a Shared Hosting Environment"Presentation Transcript
Running CF in a Shared and Dedicated Hosting Environment
Tim Nettleton [email_address]
“You can’t say that I didn’t tell ya!”
What I wish that I could tell every customer before stuff happens.
Single Site Shared
Single Site Dedicated
Goal: “To provide a stable and flexible application platform for customers to experience success and grow through profitability toward ownership.”
Tools and Solutions!
All CF Applications
Performance: CF Configurations
Limit simultaneous requests: 15
Timeout requests at 75 seconds
Restart on 3 unresponsive requests
Restart CFAS on abnormal termination
Suppress whitespace
Enforce Strict Attribute Validation
Missing Template Handler and Default Error Handler are both empty?
Tip: Stay away from ?RequestTimeout=1000000
Performance: Caching Settings
Approx. 2x total .cfm template pool.
Trusted Cache enabled for production/non-development environments.
Client Variable Storage
Default storage to NT Registry and purge at 5 days.
Only RDBMS systems allowed for External Client Storage
DO NOT increase your Application, Session variables beyond ‘acceptable’ limits
Tip: If you don’t use Client variables, don’t make CF track them.
Example code that creates unnecessary overhead:
<CFAPPLICATION NAME="CF2001" SESSIONMANAGEMENT="YES" CLIENTMANAGEMENT="YES">
Corrected code without Registry interaction:
<CFAPPLICATION NAME="CF2001" SESSIONMANAGEMENT="YES" CLIENTMANAGEMENT="NO">
Performance: Logging Settings
Log Long Running Templates. They provide an easy way to identify bottlenecks in code and database design. Any template that typically runs more than 10-15 seconds will most likely lose a user’s attention and result in F5 or Alt+F4.
Performance: Databases and DSNs
All file-based databases get a limit of ½ the total available threads
“Maintain Database Connections” is also unchecked
RDBMS databases should use a server IP address, not a hostname, in the Server field
“Maintain Database Connections” is checked
Provide a Database name with each DSN unless intended otherwise.
Performance: Databases and Code
Use CACHEDWITHIN for common shared queries
Use BLOCKFACTOR for all SELECT queries
Convert CFQUERYs to stored procedures
Use CFTRANSACTIONs around all INSERT, UPDATE and DELETE CFQUERY tags.
Use CFLOCKs with a TIMEOUT value nested inside CFTRY blocks
Use manual caching in the Application or Session scope for pinning commonly requested or non-dynamic SQL.
Run an Index Analyzer or similar tool for the most common queries.
Cache generated content with custom tag sets.
Disable RDS service. (Security)
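A minimal sketch of three of the tips above (CACHEDWITHIN with BLOCKFACTOR, CFTRANSACTION around writes, and CFLOCK with a TIMEOUT inside CFTRY). The datasource "StoreDSN", the table and column names, the form fields and the lock name are all made up for illustration; adjust them to your own application.

```cfml
<!--- Sketch only: "StoreDSN", the tables, the FORM fields and the lock name are hypothetical. --->

<!--- Cache a commonly shared lookup query for one hour and fetch rows in blocks. --->
<CFQUERY NAME="GetCategories" DATASOURCE="StoreDSN"
         CACHEDWITHIN="#CreateTimeSpan(0, 1, 0, 0)#" BLOCKFACTOR="50">
    SELECT CategoryID, CategoryName
    FROM Categories
    ORDER BY CategoryName
</CFQUERY>

<!--- Group related writes so they commit or roll back together. --->
<CFTRANSACTION>
    <CFQUERY NAME="AddOrder" DATASOURCE="StoreDSN">
        INSERT INTO Orders (ProductID, Quantity)
        VALUES (<CFQUERYPARAM VALUE="#Form.ProductID#" CFSQLTYPE="CF_SQL_INTEGER">,
                <CFQUERYPARAM VALUE="#Form.Quantity#" CFSQLTYPE="CF_SQL_INTEGER">)
    </CFQUERY>
    <CFQUERY NAME="UpdateStock" DATASOURCE="StoreDSN">
        UPDATE Products
        SET InStock = InStock - <CFQUERYPARAM VALUE="#Form.Quantity#" CFSQLTYPE="CF_SQL_INTEGER">
        WHERE ProductID = <CFQUERYPARAM VALUE="#Form.ProductID#" CFSQLTYPE="CF_SQL_INTEGER">
    </CFQUERY>
</CFTRANSACTION>

<!--- Pin the lookup in the Application scope. The lock has a TIMEOUT and sits
      inside CFTRY, so a lock that cannot be obtained is handled, not left hanging. --->
<CFTRY>
    <CFLOCK NAME="CategoryCache" TYPE="EXCLUSIVE" TIMEOUT="10">
        <CFSET Application.Categories = GetCategories>
    </CFLOCK>
    <CFCATCH TYPE="Lock">
        <!--- Lock not obtained within 10 seconds: carry on with the local query result. --->
    </CFCATCH>
</CFTRY>
```

The cached query deliberately has no bound parameters, since it is meant to be the same for every visitor.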
Performance: Databases and Code
Avoid “SELECT * FROM TABLE”.
Prefer “SELECT INT1,CHAR2,VARCHAR3,NVARCHAR4,BLOB FROM TABLE”, with columns ordered in increasing metadata size.
Use CFQUERY DBTYPE="QUERY" sparingly.
Unvalidated input is dangerous: “SELECT COLUMN1 FROM TABLE WHERE ID=#ID#”
can be abused with a request such as “DOMAIN.COM/Report.cfm?ID=2001 DELETE FROM TABLE”
NEVER use CFINSERT and CFUPDATE .
Validate input with CFQUERYPARAM, CFPARAM, VAL(), explicit validation or CGI.HTTP_REFERER:
1.) “SELECT COLUMN1 FROM TABLE WHERE ID=#VAL(ID)#”
2.) “SELECT …. WHERE ID= <CFQUERYPARAM VALUE="#URL.ID#" CFSQLTYPE="CF_SQL_INTEGER">”
3.) <CFPARAM TYPE="NUMERIC" NAME="URL.ID">
Performance: Databases and Code
Choosing the right database and reworking malfunctioning code can offer the most immediate performance and stability gain.
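As a rough sketch, here is how the validation options listed above might be combined in a report page. The datasource "ReportsDSN" and the Reports table with its columns are assumptions for illustration only.

```cfml
<!--- Sketch only: "ReportsDSN", the Reports table and its columns are hypothetical. --->

<!--- Stop the request early if URL.ID is missing or is not numeric. --->
<CFPARAM NAME="URL.ID" TYPE="NUMERIC">

<!--- Bind the value as an integer instead of pasting it into the SQL text. --->
<CFQUERY NAME="GetReport" DATASOURCE="ReportsDSN">
    SELECT ReportID, Title
    FROM Reports
    WHERE ReportID = <CFQUERYPARAM VALUE="#URL.ID#" CFSQLTYPE="CF_SQL_INTEGER">
</CFQUERY>

<CFOUTPUT QUERY="GetReport">#ReportID#: #HTMLEditFormat(Title)#<BR></CFOUTPUT>
```

With this in place, a request such as Report.cfm?ID=2001 DELETE FROM TABLE fails the numeric check and raises an error instead of reaching the database as SQL.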
Performance: Code Bottlenecks
Avoid CFEXIT as there is no guarantee that it will ever resolve.
Avoid excessive iterations in CFLOOPs and CFOUTPUTs.
CFLOCK all CFHTTP, CFFTP and CFPOP instances as they have a high probability of external failure.
Be careful not to CFINCLUDE the base template.
Look for a CFERROR page that is prone to errors.
Use timeout values and explicit error handling on all.
Enable and read debugging info in Administrator
Use PERFMON and cfstat.exe (in CFUSION\BIN) for
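One possible shape for the "CFLOCK all CFHTTP" and "timeout values and explicit error handling" tips is sketched here. The URL and the lock name are placeholders, and the timeout values are illustrative, not recommendations from the talk.

```cfml
<!--- Sketch only: the URL and the lock name are hypothetical. --->
<CFSET FeedContent = "">

<CFTRY>
    <!--- Named lock with a TIMEOUT: only one request at a time waits on the remote host. --->
    <CFLOCK NAME="RemoteFeedFetch" TYPE="EXCLUSIVE" TIMEOUT="10">
        <!--- TIMEOUT on CFHTTP keeps a slow remote server from tying up a thread. --->
        <CFHTTP URL="http://www.example.com/feed.cfm" METHOD="GET" TIMEOUT="15">
        </CFHTTP>
        <CFSET FeedContent = CFHTTP.FileContent>
    </CFLOCK>
    <CFCATCH TYPE="Lock">
        <!--- Could not get the lock within 10 seconds: fall through with empty content. --->
    </CFCATCH>
    <CFCATCH TYPE="Any">
        <!--- Remote failure or timeout: fall through with empty content. --->
    </CFCATCH>
</CFTRY>

<CFIF Len(FeedContent)>
    <CFOUTPUT>#FeedContent#</CFOUTPUT>
<CFELSE>
    <P>The remote feed is not available right now.</P>
</CFIF>
```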
Scalability:
First, choose the right database.
Load balancing: hardware or software? Sticky or not?
Why is sticky bad? It binds a particular user to an application server until the session is terminated, thereby defeating the primary goal of load balancing.
How can you avoid sticky? Avoid all server-specific, memory-resident variables. Convert to Client variables, cookies or a breed of URL identifiers, similar to the CFID and CFTOKEN sent in a CF URL.
Note: Client variables will only take simple data. No structures or queries unless serialized for text storage.
Security:
NTFS password protect the Administrator and CFDOCS or make them only accessible via non-public IP.
Patch your OS and App server like someone is watching!
Get a firewall with an IDS
Port restrictions and local traffic routing
Have your server professionally scanned
You can bet that someone is scanning it right now!
Protect yourself from URL MDAC hacking by validating input before building dynamic queries
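To illustrate the note about serializing complex data for Client variables, here is a small sketch using CFWDDX. It assumes CLIENTMANAGEMENT="YES" in your CFAPPLICATION tag and client storage that every server in the cluster can reach; the "Cart" structure and variable names are made up for the example.

```cfml
<!--- Sketch only: assumes client management is enabled; the Cart names are hypothetical. --->

<!--- Build a structure that would normally live in the Session scope. --->
<CFSET Cart = StructNew()>
<CFSET Cart.ItemCount = 2>
<CFSET Cart.Total = 49.90>

<!--- Serialize it to a WDDX text packet so it fits in a Client variable. --->
<CFWDDX ACTION="CFML2WDDX" INPUT="#Cart#" OUTPUT="CartPacket">
<CFSET Client.CartPacket = CartPacket>

<!--- On a later request, possibly on a different server, turn the text back into a structure. --->
<CFIF IsDefined("Client.CartPacket")>
    <CFWDDX ACTION="WDDX2CFML" INPUT="#Client.CartPacket#" OUTPUT="RestoredCart">
    <CFOUTPUT>Items in cart: #RestoredCart.ItemCount#</CFOUTPUT>
</CFIF>
```

Keep the packets small, since every request reads and writes the client store.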
Use CFERROR and CFTRY/CFCATCH to avoid showing an end user any private information
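A minimal sketch of the CFERROR and CFTRY/CFCATCH tip follows. The error template name, the mail address, the datasource and the table are placeholders, not anything from the talk.

```cfml
<!--- Application.cfm. Sketch only: the template name and address are hypothetical. --->
<CFERROR TYPE="REQUEST" TEMPLATE="friendly_error.cfm" MAILTO="webmaster@example.com">

<!--- In a page. "StoreDSN" and the Accounts table are hypothetical. --->
<CFTRY>
    <CFQUERY NAME="GetAccount" DATASOURCE="StoreDSN">
        SELECT AccountName
        FROM Accounts
        WHERE AccountID = <CFQUERYPARAM VALUE="#URL.ID#" CFSQLTYPE="CF_SQL_INTEGER">
    </CFQUERY>
    <CFOUTPUT QUERY="GetAccount">#HTMLEditFormat(AccountName)#</CFOUTPUT>
    <CFCATCH TYPE="Database">
        <!--- Show a generic message. Do not echo CFCATCH.Detail, SQL text or DSN names to the visitor. --->
        <P>We could not load this account right now. Please try again later.</P>
    </CFCATCH>
</CFTRY>
```

Anything the CFCATCH does not handle falls through to the CFERROR template, which should be kept as simple as possible so that it cannot fail itself.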
Security: Before After Unicode Hack .CFM, .DBM, .ASP, .ASA, etc.
Stability: Logs, logs and more logs?
A thorough examination of the logs with a complete understanding of what goes in (code) provides an insight into “What Happened?!?”
Hung Threads
Long Running Templates
Numeric Errors
Catastrophic Errors
Application Server restarts with Proximity
TIP: Run CYCLE.BAT (in CFUSION\BIN) to release an ODBC memory leak.
If you have ever looked in the /cfusion/log/ directory you have probably seen one or more of the many Cold Fusion generated error/information logs. These text files can easily grow to hundreds of MB and contain the best indicators of 'what happened'. As with any other service or application, regular review of system logs should be part of normal administration. Unfortunately, because of their large size and the fact that the data is segmented into so many logs, it is difficult to get a complete picture of performance, problems, and failure.
Developers who work on a dedicated server can use the Cold Fusion Administrator to view these logs. This can be accomplished by clicking on "Log Files" and then downloading the entire log via a browser. Unfortunately, this is usually not possible given the size of most logs and remote connection speed. For shared developers, the critical information is unavailable due to the nature of the shared environment and security. In most cases, a developer only knows what a site user tells them or what they trap using CFTRY/CFCATCH and CFERROR. Even with these mechanisms in place, the larger picture is unavailable and the majority of performance issues go unnoticed and unattended.
COSMOS
Written mainly with Cold Fusion, COSMOS is an integration of ASP, DOS, Perl, ADSI and Call/VoiceXML. It is a remote management platform that leverages the file system, registry, Metabase, service controls, and performance counters.
At present, COSMOS contains over 18 million server events. Captured within a maximum of 40 seconds, these events include all of the following:
Cold Fusion Application Server Stop/Starts
Long Running Templates
Scheduled task results
There are over 20 reports available to a dedicated client, many of which are also available for shared customers. Below is a listing of them with a brief description of how they impact the development and maintenance cycle.
General Application Error Listing - Application errors are the best view into the progress and developmental completeness of a site. A well-coded site generates no application errors. This listing provides a top-down view of the most recent Application errors for all IIS Roots. Clicking on the error message on the right opens a popup window that shows the error message as displayed to a site visitor.
General Missing Template - This applies to all .cfm templates requested by the web server but not found. In most cases, the developer doesn't even know that people are getting "404 File Not Found" messages. If a search engine indexes your site or a user bookmarks a page, a change in the site causes missed business. The solution is to use the Default Missing Template Handler in Cold Fusion Administrator or to add a CFERROR TYPE="REQUEST" in your site's Application.cfm.
Long Running Template Listing - This applies to the processing time for pages that take longer than expected. The determination of how long is too long is configured in the Logging/Settings section of Cold Fusion Administrator. A typical setting is 45 seconds, though anything taking that long would most likely be canceled or ignored by the calling client. In addition, a script running for 45 seconds could help identify a performance bottleneck for the Application Server.
Undeliverable CFMAIL Listing - When Cold Fusion is unable to deliver a message, the original template is renamed and filed in the /cfusion/mail/undelivr/ directory. An error message is also written to the Mail.log or Error.log that describes the problem preventing proper delivery. Clicking on the message at right brings those two pieces of information together. The resulting popup allows a user to correct and resend the message from their server. This function is indispensable for any business that relies on CFMAIL to reliably carry email and cannot accept undeliverable messages.
Hung Thread Listing - Probably the greatest indicator of a performance problem. Hung Threads are Cold Fusion's method of alerting us that it was unable to completely process the requested template. This is usually the result of code or database issues. CF 4.x and above have an option in the Administrator to have CF "Restart at n unresponsive requests". Hung Threads directly relate to the operation of the Application server. When the Hung Thread count matches the defined threshold, Cold Fusion reaches a critical point and will stop/restart itself to avoid excessive down time. Constant examination of Hung Threads is necessary to avoid Application Server failure.
Scheduled Task Listing - Most scheduled tasks run completely unnoticed until someone realizes that a critical function has not processed in days. This listing is not much to look at but, under the hood, a huge modification and improvement has been created for the Executive Service. COSMOS can determine if your task started, succeeded, or failed. It will also allow you to define a target string in the page HTML and record the generated content from the target URL to the database. If a scheduled task does not return the defined string, an email containing the content and diagnostics can be generated at the time of failure OR a VoiceXML application can call you with the news.
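The target-string idea can be approximated in plain CFML. This is only a sketch of the general technique, not how COSMOS itself is built; the task URL, the target string and the mail addresses are invented for the example.

```cfml
<!--- Sketch only: not the COSMOS implementation. URL, target string and addresses are hypothetical. --->
<CFSET TaskURL = "http://www.example.com/tasks/nightly.cfm">
<CFSET TargetString = "TASK COMPLETE">
<CFSET TaskContent = "">
<CFSET TaskOK = "No">

<CFTRY>
    <!--- Run the task page and capture whatever it generates. --->
    <CFHTTP URL="#TaskURL#" METHOD="GET" TIMEOUT="60">
    </CFHTTP>
    <CFSET TaskContent = CFHTTP.FileContent>
    <!--- The task counts as successful only if the page contains the target string. --->
    <CFIF FindNoCase(TargetString, TaskContent) GT 0>
        <CFSET TaskOK = "Yes">
    </CFIF>
    <CFCATCH TYPE="Any">
        <CFSET TaskContent = "Request failed: " & CFCATCH.Message>
    </CFCATCH>
</CFTRY>

<!--- On failure, mail the returned content so someone sees it at the time of failure. --->
<CFIF NOT TaskOK>
    <CFMAIL TO="admin@example.com" FROM="monitor@example.com"
            SUBJECT="Scheduled task failed: #TaskURL#">
The task did not return the expected string: #TargetString#

Returned content:
#TaskContent#
    </CFMAIL>
</CFIF>
```

For this to work, the task page itself has to print the target string as its last step, after all real work has succeeded.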
Aggregation and Stratification
More commonly called grouping, aggregation is the basis for the next series of graphs, which were created to help identify the greatest problems quickly. By examining the data based on Time, Date, and IIS Root, we can gather a greater understanding of where faults exist.
Application Errors Stratified by Date
Time/Error graph - Especially useful in determining if your day is getting better or worse, this graph breaks down the server's errors by 10-minute increments over a selectable date span. This is often used to diagnose a recurring failure point over a multiple day or week period.
Long Running Template Aggregation by IIS Root - Similar to the previous Root Aggregations, this has several prominent exceptions. Because a Long Running Page has a value associated with the processing time, I have included a column for the Sum and Average values. Using this display, it is possible to extract the templates most often run beyond acceptable limits, demanding the greatest processing time. This affects performance, though it is not necessarily a failure, and is a fantastic indicator of templates that need to be addressed before they become a stability issue.
Hung Thread Aggregation by IIS Root - This graph will often tell which application is responsible for killing the server. Over a selectable date span, one can easily see which sites are causing CF to lose resources.
Hung Threads = Bad. Puppies = Good.
One Final Look
So when did your Application Server last crash and why?
Event Chronology - The first view that brings together data from multiple sources. This report provides a chronological view of all Application Errors, Hung Threads, Long Running Templates, and Application server failures. This information threads events based on time in order to provide a trace leading up to a failure.
Spectral Analysis - This graph is unique because it rapidly identifies problems that would otherwise slip under the wire. The three colors representing CF stops (red), starts (green) and Hung threads (purple) are graphed relative to a 24-hour time line.
What now?
Read your errors and understand them.
Always look for a better solution: code and database.
Find people that can help when you get stuck.
Never give up.
Get on all related security mailings.