Posted:
Customers occasionally contact Google Cloud Platform Support to ask for help with troubleshooting latency issues in a Google App Engine application. In this post, I'll discuss how I typically isolate the root cause of this type of problem.

I start by creating a dynamic script that only returns a short text string, and then add it to the customer’s App Engine app so that it can be accessed through a known URL. For an example of such a page in Python, see the hello world tutorial.
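
As a minimal sketch of such a test script, the plain WSGI callable below returns a short, fixed text string. It is an illustration rather than the tutorial's code: the name application and the response text are assumptions, and mapping it to a URL in your app's configuration is left out.

```python
# Minimal WSGI app for latency testing (illustrative sketch, not the tutorial's code).
# It does no work beyond returning a short string, so latency measured against it
# comes from the network and the serving path rather than from application logic.

def application(environ, start_response):
    body = b"ok\n"  # hypothetical response text; any short string works
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```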

Then, I run this curl command from a terminal window, passing the URL of the test script as the final argument:

curl -s -o /dev/null -w "@curl-format.txt"


The curl command uses a format file to define its output. Here are the contents of the format file. You need to create and save this file as curl-format.txt before you run curl:


\n
time_namelookup: %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
--------\n
time_total: %{time_total}\n
\n


The output will look something like this, showing latencies in seconds:


           time_namelookup:  0.060
              time_connect:  0.098
           time_appconnect:  0.000
          time_pretransfer:  0.099
             time_redirect:  0.000
        time_starttransfer:  0.144
                           ----------
                time_total:  0.144


The value for time_connect generally represents the latency of the client’s connection to the nearest Google datacenter. If this connection is slow, you can troubleshoot further using traceroute to determine which hop on the network causes the delay, as packets traverse your ISP’s network and Google’s production network to reach the Google frontend server.
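
curl reports each of these values as cumulative seconds from the start of the request, so the time spent in a single phase is the difference between two adjacent values. The following sketch parses output in the format shown earlier and derives per-phase durations. The function names and phase labels are hypothetical, and the arithmetic assumes a request with no redirects.

```python
def parse_curl_timings(output):
    """Parse 'name: value' lines produced by the curl format file into floats."""
    timings = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().startswith("time_"):
            timings[name.strip()] = float(value)
    return timings


def phase_durations(t):
    """Convert curl's cumulative timings (seconds) into per-phase durations."""
    # time_appconnect is 0 when no TLS handshake took place (plain HTTP).
    tls = t["time_appconnect"] - t["time_connect"] if t["time_appconnect"] else 0.0
    return {
        "dns": t["time_namelookup"],
        "tcp_connect": t["time_connect"] - t["time_namelookup"],
        "tls_handshake": tls,
        "wait_for_first_byte": t["time_starttransfer"] - t["time_pretransfer"],
        "content_transfer": t["time_total"] - t["time_starttransfer"],
    }
```

For the sample output above, this gives a TCP connect time of about 0.038 seconds and a wait for the first byte of about 0.045 seconds.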


You can run tests from clients in different geographical locations. Google Cloud Platform will automatically route requests to the closest data center, which will vary based on the client’s location.


If packets reach the Google frontend server with acceptable latency, then you need to troubleshoot the source of latency problems within App Engine’s serving infrastructure or your application code or configuration.


Look at your logs for the corresponding request in the Google Developers Console. It may help to note the time at which you ran the curl command so that you can find the matching log entry.


The key field is the wall clock time for the request. This value doesn't include time spent between the client and the server that's running your application. You can calculate the time that the request spent within App Engine's serving infrastructure before reaching your application: subtract the time to reach the Google frontend server from the wall clock time.

All App Engine applications are hosted in the United States, unless their app ID is prefixed by e~, which signifies that the application is hosted in Europe. If your client is in a different geographical region from your application, you will see a significant delay as packets traverse Google’s internal network between the Google frontend server and the server running your application. You will see this delay, for example, if your application is in the US and your client is in Europe or Asia. One of the advantages of hosting your application on App Engine is that this latency is usually significantly less than if you used the public Internet to route requests to an application in another region.


Assuming that your client is in the same geographical region as your application, you can expect the App Engine serving infrastructure to add negligible latency.


Here are some additional troubleshooting tips for isolating latency problems:
  • Was the latency caused by the time to start up a new instance of your application? You will see these start-ups flagged as loading requests in the logs. Try running your tests with the default scheduler settings. In most cases, the default scheduler settings will provide an optimal tradeoff between cost and latency. If you make changes to these settings, run load tests to determine the impact. Also consider adding resident instances.
  • Do the logs show high pending time for a slow request? This is the time that your request spends in the queue waiting for an instance to be available. You can usually avoid this by reverting to the default scheduler settings. In some cases, you may need to add resident instances.
  • Are you serving a static file or using the Blobstore API to serve the request? Both of these approaches use a serving path that doesn't run any of your application’s code. Run separate tests for latency in these cases. For images, use Google’s high-performance image serving infrastructure to reduce latency.
  • Do slow requests have a large response size, according to the logs? If so, determine whether there is a bandwidth limitation between your client and Google.
  • For consistency during tests, ensure that your requests aren't cached. When running in production, add a Cache-Control HTTP header to your response in order to improve latency.
  • Does your request make API calls? If so, use Appstats to determine the time taken for API calls.
  • Do you see a high value in the CPU milliseconds field in your logs? If so, your request might be CPU-bound. Using a higher instance class may reduce latency.
  • Are you using HTTPS or a custom domain? Compare latency with HTTP requests to your appspot.com domain to isolate whether the latency is caused by these factors.
  • If you think the slowdown occurs in your code, add application logging to record timing events in your code.
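
For the last tip, one possible way to record timing events is a small context manager that logs how long a block of code took. This is a sketch using only the Python standard library; the name log_timing and the message format are assumptions, not an App Engine API.

```python
import logging
import time
from contextlib import contextmanager


@contextmanager
def log_timing(label, logger=None):
    """Log how long the enclosed block took, in milliseconds."""
    logger = logger or logging.getLogger(__name__)
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so slow failures are recorded too.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        logger.info("%s took %.1f ms", label, elapsed_ms)
```

Wrapping a suspect section in a block such as with log_timing("render page"): puts one timing line per request into the application logs, which you can then compare with the wall clock time of the request.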


If you have purchased a support package, you can contact Google Cloud Platform's support team for further help. Here is information you should have at hand to help us quickly diagnose latency caused by network issues:


  1. Your IP address. You can get that by looking at the Developers Console logs for a request sent to App Engine.
  2. The URL of your App Engine application.
  3. The IP address to which the domain name from the above URL resolves.
  4. The output of ping and traceroute from your client to the above IP address.
  5. The output from running the curl command, shown earlier in this blog post. You may want to run this a few times to ensure you have a representative result.
  6. The Developers Console logs for the above request.
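
Items 3 and 4 in this list can be gathered with a short script. The sketch below uses only the Python standard library; the function names are hypothetical, and the ping and traceroute flags assume a Linux or macOS client.

```python
import socket
from urllib.parse import urlparse


def hostname_from_url(url):
    """Extract the host name from the application's URL."""
    return urlparse(url).hostname


def resolve_ip(hostname):
    """Return the IPv4 address that the host name resolves to from this client."""
    return socket.gethostbyname(hostname)


def diagnostic_commands(ip_address):
    """Build the ping and traceroute commands to run against the resolved address.

    The flags assume Linux or macOS; on Windows the rough equivalents are
    'ping -n 4' and 'tracert'.
    """
    return [
        ["ping", "-c", "4", ip_address],
        ["traceroute", ip_address],
    ]
```

Each command list can be passed to subprocess.run with capture_output=True and text=True to save the output alongside the curl results.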


If you’d like to explore this topic further, check out our methodology for YouTube video quality and read about Mobile analysis in PageSpeed Insights.

- Posted by John Lowry, Technical Account Manager

As we’re hoping you’ve already read, we’ve just made Google Cloud Platform a first-class place to run your production Windows Server workloads. As we’re talking to customers concerned about Windows Server 2003 reaching End of Life (EOL), we’ve heard that migrations of all sorts are top of mind. We thought it would make sense to help you with resources and ideas around this transition that might make things a little easier, and a few that might make things A LOT easier.

Microsoft has a resource that outlines a 4-step approach to moving off of Windows Server 2003. The first two steps outlined on that page provide tools and recommendations for understanding what applications you need to move. Once you know which applications you need to move, Google Compute Engine can speed up - and reduce the cost of - the final two steps: Target and Migrate.

Google Cloud Platform can help you get off of Windows Server 2003 faster and for less money, especially if you’re having to buy gear to have enough room to move. We have some awesome insights based on talking with customers who have been using Windows Server 2008 R2 and 2012 R2 on Compute Engine during our beta that we think you’ll find useful.

Prove it works with quick, pennies-per-minute tests
Windows Server on Google Compute Engine boots fast. It’s easy to launch a Windows Server 2012 R2 instance from the Developers Console and be logged into the desktop in less than 7 minutes. Launching the instance takes about 5 clicks.

Windows Server on Google Compute Engine takes advantage of per-minute billing. You always pay for the first 10 minutes of an instance, and then per-minute after that.
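
To make the billing rule concrete, here is a small sketch of the arithmetic. It assumes that partial minutes round up to the next whole minute, and the hourly price is a placeholder input rather than a published rate.

```python
import math


def billable_minutes(runtime_minutes, minimum_minutes=10):
    """Minutes billed for one instance run: a 10-minute minimum, then per minute.

    Assumes partial minutes are rounded up to the next whole minute.
    """
    return max(minimum_minutes, math.ceil(runtime_minutes))


def run_cost(runtime_minutes, price_per_hour):
    """Cost of one run at a given hourly price (the price is a placeholder input)."""
    return billable_minutes(runtime_minutes) * price_per_hour / 60.0
```

Under these assumptions, a 7-minute boot-and-login check is billed as 10 minutes, while a 25-minute test run is billed as 25 minutes.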

Testing Your Application
You can use fast, pennies-per-minute access to Windows Server in Google datacenters around the world to run your application through the appropriate test suites to validate it works on the new version of Windows Server you’ve chosen.

If your application consists of multiple components, you may consider running integration tests to validate everything works together. If some of those components (Active Directory, for example) run in your data center, VPN allows you to securely connect your Compute Engine and on-prem networks. VPN is billed per-hour; delete it when your tests are complete and stop paying for it.

You may want to test how your application running on a new version of Windows Server performs under load. The Distributed Load Testing solution provides guidance and a reference implementation for generating and evaluating load against your application.

You can also use your own load testing framework to generate load on Compute Engine Preemptible Instances to reduce costs even further. In any case, when you’re done generating load you can terminate the testing infrastructure and stop paying.

User acceptance tests are also simplified with Windows Server on Google Compute Engine. You can provision a server for each tester, allowing tests to be run in parallel. You can allow RDP access to each instance while the tests are being run, and the Chrome Remote Desktop makes it simple for the testers to access their servers.

Moving and storing data
Whether you’re testing or actively migrating, you probably have data that your Windows Server instances on Compute Engine need to access. You might use VPN as described earlier to give Compute Engine instances access to an existing SAN. You could also use the Google Cloud SDK or Developers Console to copy data into Google Cloud Storage. Once your data is in Cloud Storage, your Compute Engine Windows Servers have high-bandwidth, low-latency access to it, and you can use an instance to create and snapshot a persistent disk that can be attached and used by your new VMs.

Migrating
You’ve tested and proven your application works on a supported version of Windows Server on Compute Engine. Now what? You may be tempted to scale up the new servers, migrate all the data, and go live with the update, but it’s worth considering a gradual approach with the ability to roll back in case your tests missed something. You might run the upgraded version on Compute Engine in parallel with the older version on-premises for a period of time to help identify any issues missed in testing. If you do need to roll back to the older version, simply turn off the Compute Engine infrastructure and turn it back on when you’re ready to try again.

Going all in
What if instead of only changing where your app lives, you could change how you run it?  Some customers are using the migration away from Windows Server 2003 as a chance to embrace the cloud more completely. Rather than migrate an old data warehouse, you might consider replacing it with BigQuery. Just think, it’d be the last time you ever do a patch, manage a reboot, or update the operating system on your warehouse, ever! Operating a reliable queue/messaging server can be challenging, while Google Cloud Pub/Sub gives you access to a global, high-performance service where you only pay for what you use. If you’re using IIS to serve static content for your websites, Google Cloud Storage can make things dramatically easier, and substantially less expensive too.

These are just a few of the managed services offered by Google Cloud Platform that we’ve heard from customers make a big difference in TCO for their stack, but if you’re running software circa 2003, it’s probably worth taking a close look at exactly what’s available today.

All of these technologies are available to you today. There’s no commitment, you only pay for what you use, and if you are a new Google Cloud Platform customer, there’s even a $300/60 day free trial to help you get started. Whether you’re moving a single application from Windows Server 2003 to 2012 R2, or you’re considering a cloud-optimized architecture as part of your move, we hope that easy access to Windows Server environments on Google Cloud Platform helps make your migration easy and leaves you with no surprises.

- Posted by Miles Ward and Evan Brown, Global Head of Solutions and Solutions Architect, Google Cloud Platform