
AWS Certified Machine Learning Specialty

I must admit that the sense of accomplishment after clearing the "AWS Certified Machine Learning Specialty" exam, and the adrenaline rush when you hit the submit button, are slightly addictive!



This one is special because it is my first certification from AWS. As a DevOps Engineer, it would have been easier for me to take the "AWS Certified Solutions Architect" or the "DevOps Engineer" track instead of exploring the less familiar terrain of "Machine Learning". However, stepping out of my comfort zone to learn something new made it even more fulfilling.

This post is about the learning path I followed in the run-up to this certification.

Machine Learning is a vast field in itself, and this course only scratches the surface, but it gets you started.

In summary, you will need to know the following to clear this exam:

  • How to identify the problem (Supervised, Unsupervised, Classification, Regression)

  • How to choose the algorithm (Linear models, CNN, RNN, Tree Ensemble)

  • How to train your model

  • Data preparation & transformation

  • How to use AWS ecosystem to solve the above

Distribution of Questions Asked:

The topic-wise weightage of the questions asked was as follows:

Topic: Weightage

  • Machine Learning and Deep Learning: 50%

  • AWS SageMaker: 25%

  • AWS Services: 25%

  • The total time to complete the exam was 3 hours

  • There were 65 questions asked

Preparing for the exam:

  1. I got started by watching `AWS Tech Talk` and `Deep Dive` videos on YouTube, not just about ML but about related services as well: https://www.youtube.com/channel/UCT-nPlVzJI-ccQXlxjSvJmw

  2. Followed the free training videos and tutorials from AWS (not all of them though): https://aws.amazon.com/training/learning-paths/machine-learning/exam-preparation

  3. ML/DL requires revisiting some high school/college level mathematics. Linear Algebra, Probability & Statistics, Multivariable Calculus and Optimization worked for me.

  4. Data Visualisation using Jupyter notebooks.

  5. Regression and gradient descent.

  6. DL Models - CNN, RNN

  7. Worked on understanding the following concepts-

    1. Supervised, unsupervised and reinforcement learning.

    2. Purpose of training, validation and testing data.

    3. Various ML Algorithms & Model Types-

      1. Logistic Regression

      2. Linear Regression

      3. Support Vector Machines

      4. Decision Trees / Random Forests

      5. K-means Clustering

      6. K-Nearest Neighbours
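
Two of the concepts above, gradient descent and the purpose of held-out validation data, can be illustrated with a minimal sketch. It is not taken from the exam material: the synthetic data, the learning rate and the 80/20 split are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
import random


def fit_line(points, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(epochs):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(points, w, b):
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


# Synthetic data (an assumption for illustration): y = 3x + 1 plus a little noise.
rng = random.Random(0)
data = [(x / 10, 3 * (x / 10) + 1 + rng.uniform(-0.1, 0.1)) for x in range(50)]
rng.shuffle(data)

# Hold out 20% of the rows: the model is fitted on the training rows only,
# and the validation rows estimate how well it generalises.
split = int(0.8 * len(data))
train, validation = data[:split], data[split:]

w, b = fit_line(train)
print("w = %.2f, b = %.2f" % (w, b))
print("validation MSE = %.4f" % mean_squared_error(validation, w, b))
```

A third, untouched test set would play the same role as the validation rows, but it is only looked at once, after all tuning is finished.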

Once you understand the above concepts, go ahead and try out the following AWS services-


  • SageMaker

  • Rekognition

  • Polly

  • Transcribe

  • Lex

  • Translate

  • Comprehend

  • S3 including how to secure your data

  • Athena including performance

  • Kinesis Firehose and Analytics 

  • Elastic MapReduce (EMR)

  • AWS Glue

  • QuickSight 


Those of you who regularly use AWS services won't have much of a problem grasping these.


Finally, work through plenty of practice exam questions, like the ones from the link below:


https://www.udemy.com/course/aws-machine-learning-practice-exam/


You should also have a go at the official practice exam before going for the main exam. So that was it, folks. I am still learning this discipline, and it's all volatile right now. I will feel more confident with ML once I start applying it in some real-world applications. I will write about those experiences as they come by.

My experience speaking at an Amazon Cloud Seminar in Dubai



So far, I've spoken at a number of internal company events and nerd-fests in educational institutions. Most of these have been strictly technical demonstrations of DevOps and Automation projects or purely tutorial gigs. By no means am I an accomplished or well-known tech speaker. However, I have discovered that the introvert in me disappears once I get on stage! Speaking at a large-scale AWS event was very fulfilling, so I decided to pen down the experience and what transpired at the event. Before we get into the nitty-gritty, here are some flashes of the presentation I delivered. Unfortunately, I have neither a video recording nor good photographs.




Working as a DevOps Manager at STARZPLAY, I am responsible for performance and scalability across the core building blocks of the STARZPLAY Cloud Platform. Even though we are hosted on multiple cloud platforms, AWS is where the biggest chunk of our services resides. So naturally, we interact a lot with AWS Architects and Engineers, with whom we work to build and improve this platform. It all began with one such mail exchange, in which we were invited to an AWS Containers Day Seminar to share our experiences.

And the schedule looked promising!

When I expressed my desire to join the event, the AWS folks approached me to deliver a talk about how we developed the STARZPLAY Cloud Platform and moved from a monolith to a microservices-based architecture, the challenges we faced, and how containers and ECS bailed us out of performance issues so that we could reliably serve millions of active subscribers. This was a great opportunity to revisit the architecture, and to think through and document the steps we took and how we evolved over time to build the technology stack we have now. I loved preparing the slides and getting ready for the talk. It helped me relive the experience and remember the milestones, failures and successes!



Here are some of the slides that I presented. I am not writing the transcript here; that can be a blog post for another day.



And finally, once the talk was delivered, it garnered very good responses from the hundreds of attendees. I saw a lot of forks on my GitHub repo after this event and a lot of connection requests on my LinkedIn profile. So lots of networking now in the Dubai DevOps circles!

The Senior Technical Account Manager of AWS, Middle East and North Africa, appreciated the talk; here's a mail in this regard.



And I was given an Amazon Echo Dot as a token of appreciation. Thanks, folks!

Finally, I thank my DevOps Team at STARZPLAY, Dubai, without whom I stand nowhere. And I thank my superiors, especially Saleem Bhatti and Faraz Arshad, and the awesome people at AWS MENA: Zeid, Keerthi and Paul. I look forward to more such sessions in the future.



Clustering (K-Means) MNIST images of handwritten digits on AWS SageMaker


SageMaker is a fully managed AWS service that covers the entire machine learning workflow. Using the AWS SageMaker demo, we illustrate the most important relationships, basics and functional principles.
For our experiment, we use the MNIST dataset as training data. The Modified National Institute of Standards and Technology (MNIST) database is a large database of handwritten digits that is commonly used to train various image processing systems. The database is also widely used for machine learning (ML) training and testing. The dataset was created by "remixing" the samples from the original NIST datasets. The reason for this is that its creators felt the NIST training dataset was not directly suited to machine learning experiments, because the training data came from American Census Bureau staff and the test data from American students. In the MNIST database, NIST's black-and-white images were normalised to a size of 28 x 28 pixels and anti-aliased, which introduced grayscale values.
The MNIST database of handwritten digits currently includes a training set of 60,000 examples and a test set of 10,000 examples, a subset of the NIST dataset. The MNIST data is well suited for trying out learning techniques and pattern recognition methods on real data with minimal pre-processing and formatting.
In the following experiment, we work through the example listed here - https://github.com/prasanjit-/ml_notebooks/blob/master/kmeans_mnist.ipynb


The high level steps are:

- Prepare training data
- Train a model
- Deploy & validate the model
- Use the result for predictions

Refer to the Jupyter notebook in the GitHub repo for detailed steps- https://github.com/prasanjit-/ml_notebooks/blob/master/MNISTDemo.ipynb

Below is a summary of the steps to train and deploy this model on SageMaker:
1. Create an S3 bucket
Create an S3 bucket to hold the following -
a. The model training data
b. Model artifacts (which Amazon SageMaker generates during model training).
2. Create a Notebook instance
Create a Notebook instance by logging onto: https://console.aws.amazon.com/sagemaker/
3. Create a new conda_python3 notebook
Once created, open the notebook instance and you will be directed to the Jupyter server. At this point, create a new conda_python3 notebook.
4. Specify the role
Specify the role and S3 bucket as follows:
from sagemaker import get_execution_role
role = get_execution_role()
bucket = 'bucket-name'
5. Download the MNIST dataset
Download the MNIST dataset to the notebook's memory.
The MNIST database of handwritten digits has a training set of 60,000 examples. The pickled copy used here splits them into 50,000 training and 10,000 validation examples, alongside the 10,000 test examples.
%%time
import pickle, gzip, numpy, urllib.request, json
# Load the dataset
urllib.request.urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz")
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
6. Convert to RecordIO Format
For this example, the data needs to be converted to RecordIO format, a file format for storing a sequence of records. Each record is stored as an unsigned varint specifying the length of the data, followed by the data itself as a binary blob.
Algorithms can accept input data from one or more channels. For example, an algorithm might have two channels of input data, training_data and validation_data. The configuration for each channel provides the S3 location where the input data is stored. It also provides information about the stored data: the MIME type, compression method, and whether the data is wrapped in RecordIO format.
Depending on the input mode that the algorithm supports, Amazon SageMaker either copies input data files from an S3 bucket to a local directory in the Docker container, or makes them available as input streams.
Manual transformation is not needed, since we use the fit method of the Amazon SageMaker high-level Python library in this example.
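
To make the "length, then binary blob" idea from this step concrete, here is a minimal sketch of varint length-prefixed framing in plain Python. It is only a generic illustration of the layout described above and is not the exact wire format that SageMaker reads and writes, which the high-level library handles for you. The function names are my own and hypothetical.

```python
def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def write_records(blobs):
    """Concatenate blobs, each preceded by its length as a varint."""
    return b"".join(encode_varint(len(blob)) + blob for blob in blobs)


def read_records(stream):
    """Yield each blob back from a byte string produced by write_records."""
    pos = 0
    while pos < len(stream):
        # Decode the varint length prefix, 7 bits at a time.
        length, shift = 0, 0
        while True:
            byte = stream[pos]
            pos += 1
            length |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        yield stream[pos:pos + length]
        pos += length


records = [b"first", b"", b"x" * 300]
framed = write_records(records)
assert list(read_records(framed)) == records
print(len(framed), "bytes for", len(records), "records")
```

Because every record carries its own length, a reader can stream through the file one record at a time without loading it all into memory.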
7. Create a training job
In this example we will use the Amazon SageMaker KMeans module.
Import KMeans from SageMaker and configure the training job as follows:
from sagemaker import KMeans
data_location = 's3://{}/kmeans_highlevel_example/data'.format(bucket)
output_location = 's3://{}/kmeans_example/output'.format(bucket)
print('training data will be uploaded to: {}'.format(data_location))
print('training artifacts will be uploaded to: {}'.format(output_location))
kmeans = KMeans(role=role,
                train_instance_count=2,
                train_instance_type='ml.c4.8xlarge',
                output_path=output_location,
                k=10,
                data_location=data_location)
  • role — The IAM role that Amazon SageMaker can assume to perform tasks on your behalf (for example, reading training results, called model artifacts, from the S3 bucket and writing training results to Amazon S3).
  • output_path — The S3 location where Amazon SageMaker stores the training results.
  • train_instance_count and train_instance_type — The number and type of ML EC2 compute instances to use for model training.
  • k — The number of clusters to create. For more information, see K-Means Hyperparameters.
  • data_location — The S3 location where the high-level library uploads the transformed training data.
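
To show what the k parameter controls, here is a minimal sketch of plain k-means (Lloyd's algorithm) on toy 2-D points, using only the standard library. This is not how the SageMaker algorithm is implemented; the built-in version is a distributed, mini-batch variant. The toy data, seeds and iteration count are assumptions for illustration.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm: assign points to the nearest centroid, then re-average."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids


# Toy 2-D data (an assumption for illustration): two well-separated blobs.
rng = random.Random(1)
blob_a = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)]
blob_b = [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(50)]
centroids = sorted(kmeans(blob_a + blob_b, k=2))
print(centroids)
```

With k=2 the two centroids settle near the centres of the two blobs. In the MNIST job, k=10 asks for ten centroids, each a 784-value "average image".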
8. Start Model Training
%%time
kmeans.fit(kmeans.record_set(train_set[0]))
9. Deploy a Model
Deploying a model is a three-step process.
  • Create a Model — CreateModel request is used to provide information such as the location of the S3 bucket that contains your model artifacts and the registry path of the image that contains inference code.
  • Create an Endpoint Configuration — CreateEndpointConfig request is used to provide the resource configuration for hosting. This includes the type and number of ML compute instances to launch for deploying the model.
  • Create an Endpoint — CreateEndpoint request is used to create an endpoint. Amazon SageMaker launches the ML compute instances and deploys the model.
The deploy method of the high-level Python library performs all of these tasks.
%%time
kmeans_predictor = kmeans.deploy(initial_instance_count=1,
                                 instance_type='ml.m4.xlarge')
The sagemaker.amazon.kmeans.KMeans instance knows the registry path of the image that contains the k-means inference code, so you don’t need to provide it.
This is a synchronous operation. The method waits until the deployment completes before returning. It returns a kmeans_predictor.
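
For comparison, the three requests listed in this step could be issued by hand through a low-level client. The sketch below is how I would expect that to look, assuming the boto3 SageMaker client methods create_model, create_endpoint_config and create_endpoint with the request fields shown. The resource names, image path, S3 path and role ARN are hypothetical placeholders, and RecordingClient is a made-up stand-in so that the example runs without touching AWS.

```python
def deploy_low_level(client, name, image_uri, model_data_url, role_arn,
                     instance_type="ml.m4.xlarge", instance_count=1):
    """Issue the three hosting requests in order using a SageMaker client."""
    # 1. CreateModel: where the artifacts live and which image serves them.
    client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. CreateEndpointConfig: the instance type and count used for hosting.
    client.create_endpoint_config(
        EndpointConfigName=name + "-config",
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    # 3. CreateEndpoint: launches the instances and deploys the model.
    client.create_endpoint(EndpointName=name, EndpointConfigName=name + "-config")
    return name


class RecordingClient:
    """Hypothetical stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, operation):
        def record(**kwargs):
            self.calls.append((operation, kwargs))
            return {}
        return record


# Dry run with placeholder values; nothing here reaches AWS.
dry_run = RecordingClient()
deploy_low_level(dry_run, "kmeans-demo", "<registry-path-of-kmeans-image>",
                 "s3://bucket-name/path/to/model.tar.gz",
                 "arn:aws:iam::123456789012:role/ExampleRole")
print([operation for operation, _ in dry_run.calls])
```

Against a real account you would pass boto3.client("sagemaker") instead of the recording stub. Unlike the high-level deploy method, the low-level create_endpoint request returns before the endpoint is ready, so you would need to wait for it yourself.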
10. Validate the Model
Here we get an inference for the image at index 30 of the valid_set dataset.
result = kmeans_predictor.predict(valid_set[0][30:31])
print(result)
The result shows the closest cluster and the distance from that cluster.
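
As a rough picture of what that result contains, the sketch below finds the nearest centroid and its distance for a single point. It assumes Euclidean distance and uses made-up 2-D centroids; it is not the endpoint's code, just the idea behind "closest cluster" and "distance". It needs Python 3.8 or later for math.dist.

```python
import math


def closest_cluster(point, centroids):
    """Return (index, Euclidean distance) of the centroid nearest to point."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]


# Hypothetical 2-D centroids; the real model has 10 centroids of 784 values each.
centroids = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
index, distance = closest_cluster((3.0, 1.0), centroids)
print(index, distance)  # prints: 1 3.0
```

Note that the cluster index is just a label chosen by the algorithm. Cluster 1 does not mean the digit 1, so you have to inspect a few images per cluster to see which digit each cluster mostly captures.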
This video has a complete demonstration of this experiment.

Below is the set of commands that were executed and the results of the execution:



sagemaker-prasanjit-01
In [1]:
from sagemaker import get_execution_role

role = get_execution_role()
bucket = 'sagemaker-ps-01' # Use the name of your s3 bucket here
In [2]:
role
Out[2]:
'arn:aws:iam::779615490104:role/service-role/AmazonSageMaker-ExecutionRole-20191103T150143'
In [3]:
%%time
import pickle, gzip, numpy, urllib.request, json

# Load the dataset
urllib.request.urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz")
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
CPU times: user 892 ms, sys: 278 ms, total: 1.17 s
Wall time: 4.6 s
In [6]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (2,10)


def show_digit(img, caption='', subplot=None):
    if subplot == None:
        _, (subplot) = plt.subplots(1,1)
    imgr = img.reshape((28,28))
    subplot.axis('off')
    subplot.imshow(imgr, cmap='gray')
    plt.title(caption)

show_digit(train_set[0][1], 'This is a {}'.format(train_set[1][1]))
In [7]:
from sagemaker import KMeans

data_location = 's3://{}/kmeans_highlevel_example/data'.format(bucket)
output_location = 's3://{}/kmeans_highlevel_example/output'.format(bucket)

print('training data will be uploaded to: {}'.format(data_location))
print('training artifacts will be uploaded to: {}'.format(output_location))

kmeans = KMeans(role=role,
                train_instance_count=2,
                train_instance_type='ml.c4.8xlarge',
                output_path=output_location,
                k=10,
                epochs=100,
                data_location=data_location)
training data will be uploaded to: s3://sagemaker-ps-01/kmeans_highlevel_example/data
training artifacts will be uploaded to: s3://sagemaker-ps-01/kmeans_highlevel_example/output
In [8]:
%%time

kmeans.fit(kmeans.record_set(train_set[0]))
2019-11-03 11:45:01 Starting - Starting the training job...
2019-11-03 11:45:03 Starting - Launching requested ML instances......
2019-11-03 11:46:02 Starting - Preparing the instances for training...
2019-11-03 11:46:43 Downloading - Downloading input data...
2019-11-03 11:47:26 Training - Training image download completed. Training in progress..Docker entrypoint called with argument(s): train
[11/03/2019 11:47:28 INFO 140552810366784] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/algorithm/resources/default-input.json: {u'_enable_profiler': u'false', u'_tuning_objective_metric': u'', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'_kvstore': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'true', u'epochs': u'1', u'init_method': u'random', u'local_lloyd_tol': u'0.0001', u'local_lloyd_max_iter': u'300', u'_disable_wait_to_read': u'false', u'extra_center_factor': u'auto', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'half_life_time_size': u'0', u'_num_slices': u'1'}
[11/03/2019 11:47:28 INFO 140552810366784] Reading provided configuration from /opt/ml/input/config/hyperparameters.json: {u'epochs': u'100', u'feature_dim': u'784', u'k': u'10', u'force_dense': u'True'}
[11/03/2019 11:47:28 INFO 140552810366784] Final configuration: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:28 WARNING 140552810366784] Loggers have already been setup.
[11/03/2019 11:47:28 INFO 140552810366784] Environment: {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/dc163b99-1521-4ccb-ad30-92ce3ffc3cce', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_PS_ROOT_PORT': '9000', 'DMLC_NUM_WORKER': '2', 'SAGEMAKER_HTTP_PORT': '8080', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'HOME': '/root', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-208-60.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/4055df43-a805-42f4-8085-40b6d8b6ab74', 'DMLC_ROLE': 'worker', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
Process 1 is a worker.
[11/03/2019 11:47:28 INFO 140552810366784] Using default worker.
[11/03/2019 11:47:28 INFO 140552810366784] Loaded iterator creator application/x-recordio-protobuf for content type ('application/x-recordio-protobuf', '1.0')
[11/03/2019 11:47:28 INFO 140552810366784] Create Store: dist_async
Docker entrypoint called with argument(s): train
[11/03/2019 11:47:29 INFO 140169171593024] Reading default configuration from /opt/amazon/lib/python2.7/site-packages/algorithm/resources/default-input.json: {u'_enable_profiler': u'false', u'_tuning_objective_metric': u'', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'_kvstore': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'true', u'epochs': u'1', u'init_method': u'random', u'local_lloyd_tol': u'0.0001', u'local_lloyd_max_iter': u'300', u'_disable_wait_to_read': u'false', u'extra_center_factor': u'auto', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'half_life_time_size': u'0', u'_num_slices': u'1'}
[11/03/2019 11:47:29 INFO 140169171593024] Reading provided configuration from /opt/ml/input/config/hyperparameters.json: {u'epochs': u'100', u'feature_dim': u'784', u'k': u'10', u'force_dense': u'True'}
[11/03/2019 11:47:29 INFO 140169171593024] Final configuration: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:29 WARNING 140169171593024] Loggers have already been setup.
[11/03/2019 11:47:29 INFO 140169171593024] Launching parameter server for role scheduler
[11/03/2019 11:47:29 INFO 140169171593024] {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'PWD': '/', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] envs={'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_NUM_WORKER': '2', 'DMLC_PS_ROOT_PORT': '9000', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'DMLC_ROLE': 'scheduler', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] Launching parameter server for role server
[11/03/2019 11:47:29 INFO 140169171593024] {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'PWD': '/', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] envs={'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_NUM_WORKER': '2', 'DMLC_PS_ROOT_PORT': '9000', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'SAGEMAKER_HTTP_PORT': '8080', 'HOME': '/root', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'DMLC_ROLE': 'server', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
[11/03/2019 11:47:29 INFO 140169171593024] Environment: {'ECS_CONTAINER_METADATA_URI': 'http://169.254.170.2/v3/86d7c856-2158-4dd0-a0f9-7e34716c8d05', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION_VERSION': '2', 'DMLC_PS_ROOT_PORT': '9000', 'DMLC_NUM_WORKER': '2', 'SAGEMAKER_HTTP_PORT': '8080', 'PATH': '/opt/amazon/bin:/usr/local/nvidia/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/amazon/bin:/opt/amazon/bin', 'PYTHONUNBUFFERED': 'TRUE', 'CANONICAL_ENVROOT': '/opt/amazon', 'LD_LIBRARY_PATH': '/opt/amazon/lib/python2.7/site-packages/cv2/../../../../lib:/usr/local/nvidia/lib64:/opt/amazon/lib', 'MXNET_KVSTORE_BIGARRAY_BOUND': '400000000', 'LANG': 'en_US.utf8', 'DMLC_INTERFACE': 'eth0', 'SHLVL': '1', 'DMLC_PS_ROOT_URI': '10.0.229.182', 'AWS_REGION': 'eu-west-1', 'NVIDIA_VISIBLE_DEVICES': 'void', 'TRAINING_JOB_NAME': 'kmeans-2019-11-03-11-45-00-997', 'HOME': '/root', 'PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION': 'cpp', 'ENVROOT': '/opt/amazon', 'SAGEMAKER_DATA_PATH': '/opt/ml', 'NVIDIA_DRIVER_CAPABILITIES': 'compute,utility', 'NVIDIA_REQUIRE_CUDA': 'cuda>=9.0', 'OMP_NUM_THREADS': '18', 'HOSTNAME': 'ip-10-0-229-182.eu-west-1.compute.internal', 'AWS_CONTAINER_CREDENTIALS_RELATIVE_URI': '/v2/credentials/05ea5ad8-333f-415c-981c-e8b507b70f15', 'DMLC_ROLE': 'worker', 'PWD': '/', 'DMLC_NUM_SERVER': '1', 'TRAINING_JOB_ARN': 'arn:aws:sagemaker:eu-west-1:779615490104:training-job/kmeans-2019-11-03-11-45-00-997', 'AWS_EXECUTION_ENV': 'AWS_ECS_EC2'}
Process 109 is a shell:scheduler.
Process 118 is a shell:server.
Process 1 is a worker.
[11/03/2019 11:47:29 INFO 140169171593024] Using default worker.
[11/03/2019 11:47:29 INFO 140169171593024] Loaded iterator creator application/x-recordio-protobuf for content type ('application/x-recordio-protobuf', '1.0')
[11/03/2019 11:47:29 INFO 140169171593024] Create Store: dist_async
[11/03/2019 11:47:30 INFO 140552810366784] nvidia-smi took: 0.0252320766449 secs to identify 0 gpus
[11/03/2019 11:47:30 INFO 140552810366784] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140552810366784] Setting up with params: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:30 INFO 140552810366784] 'extra_center_factor' was set to 'auto', evaluated to 10.
[11/03/2019 11:47:30 INFO 140552810366784] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140552810366784] number of center slices 1
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Batches Since Last Reset": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Records Since Last Reset": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Total Batches Seen": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Total Records Seen": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Max Records Seen Between Resets": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Reset Count": {"count": 1, "max": 0, "sum": 0.0, "min": 0}}, "EndTime": 1572781650.394244, "Dimensions": {"Host": "algo-2", "Meta": "init_train_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.394209}

[2019-11-03 11:47:30.417] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 0, "duration": 87, "num_examples": 1, "num_bytes": 15820000}
[2019-11-03 11:47:30.596] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 1, "duration": 178, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 1 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 6, "sum": 6.0, "min": 6}, "Total Records Seen": {"count": 1, "max": 30000, "sum": 30000.0, "min": 30000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 1, "sum": 1.0, "min": 1}}, "EndTime": 1572781650.596894, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 0}, "StartTime": 1572781650.417194}

[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=139020.312597 records/second
[11/03/2019 11:47:30 INFO 140169171593024] nvidia-smi took: 0.025279045105 secs to identify 0 gpus
[11/03/2019 11:47:30 INFO 140169171593024] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140169171593024] Setting up with params: {u'_tuning_objective_metric': u'', u'extra_center_factor': u'auto', u'local_lloyd_init_method': u'kmeans++', u'force_dense': u'True', u'epochs': u'100', u'feature_dim': u'784', u'local_lloyd_tol': u'0.0001', u'_disable_wait_to_read': u'false', u'eval_metrics': u'["msd"]', u'_num_kv_servers': u'1', u'mini_batch_size': u'5000', u'_enable_profiler': u'false', u'_num_gpus': u'auto', u'local_lloyd_num_trials': u'auto', u'_log_level': u'info', u'init_method': u'random', u'half_life_time_size': u'0', u'local_lloyd_max_iter': u'300', u'_kvstore': u'auto', u'k': u'10', u'_num_slices': u'1'}
[11/03/2019 11:47:30 INFO 140169171593024] 'extra_center_factor' was set to 'auto', evaluated to 10.
[11/03/2019 11:47:30 INFO 140169171593024] Number of GPUs being used: 0
[11/03/2019 11:47:30 INFO 140169171593024] number of center slices 1
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Batches Since Last Reset": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Number of Records Since Last Reset": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Total Batches Seen": {"count": 1, "max": 1, "sum": 1.0, "min": 1}, "Total Records Seen": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Max Records Seen Between Resets": {"count": 1, "max": 5000, "sum": 5000.0, "min": 5000}, "Reset Count": {"count": 1, "max": 0, "sum": 0.0, "min": 0}}, "EndTime": 1572781650.390149, "Dimensions": {"Host": "algo-1", "Meta": "init_train_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.390114}

[2019-11-03 11:47:30.413] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 0, "duration": 88, "num_examples": 1, "num_bytes": 15820000}
[2019-11-03 11:47:30.610] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 1, "duration": 196, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 1 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 6, "sum": 6.0, "min": 6}, "Total Records Seen": {"count": 1, "max": 30000, "sum": 30000.0, "min": 30000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 1, "sum": 1.0, "min": 1}}, "EndTime": 1572781650.611, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 0}, "StartTime": 1572781650.413488}

[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=126474.646596 records/second
[2019-11-03 11:47:30.732] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 3, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 2 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 11, "sum": 11.0, "min": 11}, "Total Records Seen": {"count": 1, "max": 55000, "sum": 55000.0, "min": 55000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 2, "sum": 2.0, "min": 2}}, "EndTime": 1572781650.732486, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 1}, "StartTime": 1572781650.611256}

[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206017.191414 records/second
[2019-11-03 11:47:30.853] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 5, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 3 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 16, "sum": 16.0, "min": 16}, "Total Records Seen": {"count": 1, "max": 80000, "sum": 80000.0, "min": 80000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 3, "sum": 3.0, "min": 3}}, "EndTime": 1572781650.854186, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 2}, "StartTime": 1572781650.732736}

[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=205620.877095 records/second
[2019-11-03 11:47:30.962] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 7, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140169171593024] #progress_metric: host=algo-1, completed 4 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 21, "sum": 21.0, "min": 21}, "Total Records Seen": {"count": 1, "max": 105000, "sum": 105000.0, "min": 105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 4, "sum": 4.0, "min": 4}}, "EndTime": 1572781650.96329, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 3}, "StartTime": 1572781650.856089}

[11/03/2019 11:47:30 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=232952.697479 records/second
[2019-11-03 11:47:31.061] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 9, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 5 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 26, "sum": 26.0, "min": 26}, "Total Records Seen": {"count": 1, "max": 130000, "sum": 130000.0, "min": 130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 5, "sum": 5.0, "min": 5}}, "EndTime": 1572781651.061609, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 4}, "StartTime": 1572781650.963495}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=254481.560222 records/second
[2019-11-03 11:47:31.176] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 11, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 6 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 31, "sum": 31.0, "min": 31}, "Total Records Seen": {"count": 1, "max": 155000, "sum": 155000.0, "min": 155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 6, "sum": 6.0, "min": 6}}, "EndTime": 1572781651.177087, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 5}, "StartTime": 1572781651.061859}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216692.257301 records/second
[2019-11-03 11:47:30.711] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 3, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 2 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 11, "sum": 11.0, "min": 11}, "Total Records Seen": {"count": 1, "max": 55000, "sum": 55000.0, "min": 55000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 2, "sum": 2.0, "min": 2}}, "EndTime": 1572781650.712005, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 1}, "StartTime": 1572781650.597101}

[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=217345.775486 records/second
[2019-11-03 11:47:30.825] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 5, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 3 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 16, "sum": 16.0, "min": 16}, "Total Records Seen": {"count": 1, "max": 80000, "sum": 80000.0, "min": 80000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 3, "sum": 3.0, "min": 3}}, "EndTime": 1572781650.826047, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 2}, "StartTime": 1572781650.712255}

[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=219428.877551 records/second
[2019-11-03 11:47:30.942] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 7, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:30 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:30 INFO 140552810366784] #progress_metric: host=algo-2, completed 4 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 21, "sum": 21.0, "min": 21}, "Total Records Seen": {"count": 1, "max": 105000, "sum": 105000.0, "min": 105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 4, "sum": 4.0, "min": 4}}, "EndTime": 1572781650.943047, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 3}, "StartTime": 1572781650.826549}

[11/03/2019 11:47:30 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=214336.728541 records/second
[2019-11-03 11:47:31.046] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 9, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 5 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 26, "sum": 26.0, "min": 26}, "Total Records Seen": {"count": 1, "max": 130000, "sum": 130000.0, "min": 130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 5, "sum": 5.0, "min": 5}}, "EndTime": 1572781651.046523, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 4}, "StartTime": 1572781650.943299}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241870.421288 records/second
[2019-11-03 11:47:31.143] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 11, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 6 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 31, "sum": 31.0, "min": 31}, "Total Records Seen": {"count": 1, "max": 155000, "sum": 155000.0, "min": 155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 6, "sum": 6.0, "min": 6}}, "EndTime": 1572781651.144019, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 5}, "StartTime": 1572781651.046998}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=257288.957374 records/second
[2019-11-03 11:47:31.244] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 13, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 7 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 36, "sum": 36.0, "min": 36}, "Total Records Seen": {"count": 1, "max": 180000, "sum": 180000.0, "min": 180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 7, "sum": 7.0, "min": 7}}, "EndTime": 1572781651.244924, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 6}, "StartTime": 1572781651.144272}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=248028.100718 records/second
[2019-11-03 11:47:31.344] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 15, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 8 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 41, "sum": 41.0, "min": 41}, "Total Records Seen": {"count": 1, "max": 205000, "sum": 205000.0, "min": 205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 8, "sum": 8.0, "min": 8}}, "EndTime": 1572781651.345334, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 7}, "StartTime": 1572781651.245178}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=249264.503124 records/second
[2019-11-03 11:47:31.437] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 17, "duration": 91, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 9 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 46, "sum": 46.0, "min": 46}, "Total Records Seen": {"count": 1, "max": 230000, "sum": 230000.0, "min": 230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 9, "sum": 9.0, "min": 9}}, "EndTime": 1572781651.437796, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 8}, "StartTime": 1572781651.345584}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=270515.787328 records/second
[2019-11-03 11:47:31.544] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 19, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 10 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 51, "sum": 51.0, "min": 51}, "Total Records Seen": {"count": 1, "max": 255000, "sum": 255000.0, "min": 255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 10, "sum": 10.0, "min": 10}}, "EndTime": 1572781651.54472, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 9}, "StartTime": 1572781651.438118}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=234205.612486 records/second
[2019-11-03 11:47:31.299] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 13, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 7 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 36, "sum": 36.0, "min": 36}, "Total Records Seen": {"count": 1, "max": 180000, "sum": 180000.0, "min": 180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 7, "sum": 7.0, "min": 7}}, "EndTime": 1572781651.300212, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 6}, "StartTime": 1572781651.179075}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206112.356017 records/second
[2019-11-03 11:47:31.417] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 15, "duration": 117, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 8 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 41, "sum": 41.0, "min": 41}, "Total Records Seen": {"count": 1, "max": 205000, "sum": 205000.0, "min": 205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 8, "sum": 8.0, "min": 8}}, "EndTime": 1572781651.418261, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 7}, "StartTime": 1572781651.300484}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=212013.425533 records/second
[2019-11-03 11:47:31.537] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 17, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 9 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 46, "sum": 46.0, "min": 46}, "Total Records Seen": {"count": 1, "max": 230000, "sum": 230000.0, "min": 230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 9, "sum": 9.0, "min": 9}}, "EndTime": 1572781651.537545, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 8}, "StartTime": 1572781651.41851}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209763.02597 records/second
[2019-11-03 11:47:31.659] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 19, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 10 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 51, "sum": 51.0, "min": 51}, "Total Records Seen": {"count": 1, "max": 255000, "sum": 255000.0, "min": 255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 10, "sum": 10.0, "min": 10}}, "EndTime": 1572781651.659652, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 9}, "StartTime": 1572781651.539169}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=207255.491823 records/second
[2019-11-03 11:47:31.766] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 21, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 11 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 56, "sum": 56.0, "min": 56}, "Total Records Seen": {"count": 1, "max": 280000, "sum": 280000.0, "min": 280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 11, "sum": 11.0, "min": 11}}, "EndTime": 1572781651.766884, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 10}, "StartTime": 1572781651.66}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233601.411532 records/second
[2019-11-03 11:47:31.880] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 23, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140169171593024] #progress_metric: host=algo-1, completed 12 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 61, "sum": 61.0, "min": 61}, "Total Records Seen": {"count": 1, "max": 305000, "sum": 305000.0, "min": 305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 12, "sum": 12.0, "min": 12}}, "EndTime": 1572781651.882341, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 11}, "StartTime": 1572781651.767134}

[11/03/2019 11:47:31 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216771.099282 records/second
[2019-11-03 11:47:32.005] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 25, "duration": 122, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 13 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 66, "sum": 66.0, "min": 66}, "Total Records Seen": {"count": 1, "max": 330000, "sum": 330000.0, "min": 330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 13, "sum": 13.0, "min": 13}}, "EndTime": 1572781652.006303, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 12}, "StartTime": 1572781651.882572}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=201845.642106 records/second
[2019-11-03 11:47:32.126] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 27, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 14 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 71, "sum": 71.0, "min": 71}, "Total Records Seen": {"count": 1, "max": 355000, "sum": 355000.0, "min": 355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 14, "sum": 14.0, "min": 14}}, "EndTime": 1572781652.12742, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 13}, "StartTime": 1572781652.006544}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206517.890027 records/second
[2019-11-03 11:47:31.671] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 21, "duration": 126, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 11 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 56, "sum": 56.0, "min": 56}, "Total Records Seen": {"count": 1, "max": 280000, "sum": 280000.0, "min": 280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 11, "sum": 11.0, "min": 11}}, "EndTime": 1572781651.671943, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 10}, "StartTime": 1572781651.544972}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=196678.558433 records/second
[2019-11-03 11:47:31.777] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 23, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 12 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 61, "sum": 61.0, "min": 61}, "Total Records Seen": {"count": 1, "max": 305000, "sum": 305000.0, "min": 305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 12, "sum": 12.0, "min": 12}}, "EndTime": 1572781651.778138, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 11}, "StartTime": 1572781651.672195}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=235702.324034 records/second
[2019-11-03 11:47:31.885] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 25, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 13 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 66, "sum": 66.0, "min": 66}, "Total Records Seen": {"count": 1, "max": 330000, "sum": 330000.0, "min": 330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 13, "sum": 13.0, "min": 13}}, "EndTime": 1572781651.885934, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 12}, "StartTime": 1572781651.778343}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=232080.829543 records/second
[2019-11-03 11:47:31.995] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 27, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:31 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:31 INFO 140552810366784] #progress_metric: host=algo-2, completed 14 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 71, "sum": 71.0, "min": 71}, "Total Records Seen": {"count": 1, "max": 355000, "sum": 355000.0, "min": 355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 14, "sum": 14.0, "min": 14}}, "EndTime": 1572781651.996102, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 13}, "StartTime": 1572781651.887734}

[11/03/2019 11:47:31 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=230376.771092 records/second
[2019-11-03 11:47:32.102] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 29, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 15 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 76, "sum": 76.0, "min": 76}, "Total Records Seen": {"count": 1, "max": 380000, "sum": 380000.0, "min": 380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 15, "sum": 15.0, "min": 15}}, "EndTime": 1572781652.102514, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 14}, "StartTime": 1572781651.998441}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=239888.906428 records/second
[2019-11-03 11:47:32.208] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 31, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 16 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 81, "sum": 81.0, "min": 81}, "Total Records Seen": {"count": 1, "max": 405000, "sum": 405000.0, "min": 405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 16, "sum": 16.0, "min": 16}}, "EndTime": 1572781652.209094, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 15}, "StartTime": 1572781652.10273}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=234803.482498 records/second
[2019-11-03 11:47:32.323] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 33, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 17 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 86, "sum": 86.0, "min": 86}, "Total Records Seen": {"count": 1, "max": 430000, "sum": 430000.0, "min": 430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 17, "sum": 17.0, "min": 17}}, "EndTime": 1572781652.324047, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 16}, "StartTime": 1572781652.210925}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=220762.137353 records/second
[2019-11-03 11:47:32.416] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 35, "duration": 90, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 18 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 91, "sum": 91.0, "min": 91}, "Total Records Seen": {"count": 1, "max": 455000, "sum": 455000.0, "min": 455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 18, "sum": 18.0, "min": 18}}, "EndTime": 1572781652.417163, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 17}, "StartTime": 1572781652.325707}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=272922.387384 records/second
[2019-11-03 11:47:32.526] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 37, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 19 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 96, "sum": 96.0, "min": 96}, "Total Records Seen": {"count": 1, "max": 480000, "sum": 480000.0, "min": 480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 19, "sum": 19.0, "min": 19}}, "EndTime": 1572781652.527196, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 18}, "StartTime": 1572781652.417384}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=227432.165709 records/second
[2019-11-03 11:47:32.626] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 39, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 20 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 101, "sum": 101.0, "min": 101}, "Total Records Seen": {"count": 1, "max": 505000, "sum": 505000.0, "min": 505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 20, "sum": 20.0, "min": 20}}, "EndTime": 1572781652.62669, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 19}, "StartTime": 1572781652.528697}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254831.607036 records/second
[2019-11-03 11:47:32.248] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 29, "duration": 120, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 15 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 76, "sum": 76.0, "min": 76}, "Total Records Seen": {"count": 1, "max": 380000, "sum": 380000.0, "min": 380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 15, "sum": 15.0, "min": 15}}, "EndTime": 1572781652.249401, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 14}, "StartTime": 1572781652.127715}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=205183.516847 records/second
[2019-11-03 11:47:32.377] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 31, "duration": 125, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 16 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 81, "sum": 81.0, "min": 81}, "Total Records Seen": {"count": 1, "max": 405000, "sum": 405000.0, "min": 405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 16, "sum": 16.0, "min": 16}}, "EndTime": 1572781652.378822, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 15}, "StartTime": 1572781652.2497}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=193302.289226 records/second
[2019-11-03 11:47:32.496] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 33, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 17 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 86, "sum": 86.0, "min": 86}, "Total Records Seen": {"count": 1, "max": 430000, "sum": 430000.0, "min": 430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 17, "sum": 17.0, "min": 17}}, "EndTime": 1572781652.496576, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 16}, "StartTime": 1572781652.379179}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=212693.332035 records/second
[2019-11-03 11:47:32.615] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 35, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 18 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 91, "sum": 91.0, "min": 91}, "Total Records Seen": {"count": 1, "max": 455000, "sum": 455000.0, "min": 455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 18, "sum": 18.0, "min": 18}}, "EndTime": 1572781652.616174, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 17}, "StartTime": 1572781652.496823}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209236.884482 records/second
[2019-11-03 11:47:32.737] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 37, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 19 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 96, "sum": 96.0, "min": 96}, "Total Records Seen": {"count": 1, "max": 480000, "sum": 480000.0, "min": 480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 19, "sum": 19.0, "min": 19}}, "EndTime": 1572781652.738183, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 18}, "StartTime": 1572781652.616413}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=205061.132536 records/second
[2019-11-03 11:47:32.857] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 39, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 20 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 101, "sum": 101.0, "min": 101}, "Total Records Seen": {"count": 1, "max": 505000, "sum": 505000.0, "min": 505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 20, "sum": 20.0, "min": 20}}, "EndTime": 1572781652.858275, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 19}, "StartTime": 1572781652.738444}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208381.972142 records/second
[2019-11-03 11:47:32.966] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 41, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140169171593024] #progress_metric: host=algo-1, completed 21 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 106, "sum": 106.0, "min": 106}, "Total Records Seen": {"count": 1, "max": 530000, "sum": 530000.0, "min": 530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 21, "sum": 21.0, "min": 21}}, "EndTime": 1572781652.966966, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 20}, "StartTime": 1572781652.858526}

[11/03/2019 11:47:32 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=229766.962848 records/second
[2019-11-03 11:47:33.074] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 43, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 22 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 111, "sum": 111.0, "min": 111}, "Total Records Seen": {"count": 1, "max": 555000, "sum": 555000.0, "min": 555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 22, "sum": 22.0, "min": 22}}, "EndTime": 1572781653.075226, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 21}, "StartTime": 1572781652.967602}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=231852.986895 records/second
[2019-11-03 11:47:33.183] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 45, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 23 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 116, "sum": 116.0, "min": 116}, "Total Records Seen": {"count": 1, "max": 580000, "sum": 580000.0, "min": 580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 23, "sum": 23.0, "min": 23}}, "EndTime": 1572781653.183966, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 22}, "StartTime": 1572781653.075554}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=230295.815882 records/second
[2019-11-03 11:47:32.733] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 41, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 21 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 106, "sum": 106.0, "min": 106}, "Total Records Seen": {"count": 1, "max": 530000, "sum": 530000.0, "min": 530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 21, "sum": 21.0, "min": 21}}, "EndTime": 1572781652.733428, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 20}, "StartTime": 1572781652.626892}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=234418.188728 records/second
[2019-11-03 11:47:32.852] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 43, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 22 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 111, "sum": 111.0, "min": 111}, "Total Records Seen": {"count": 1, "max": 555000, "sum": 555000.0, "min": 555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 22, "sum": 22.0, "min": 22}}, "EndTime": 1572781652.85269, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 21}, "StartTime": 1572781652.73363}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=209777.294079 records/second
[2019-11-03 11:47:32.963] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 45, "duration": 110, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:32 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:32 INFO 140552810366784] #progress_metric: host=algo-2, completed 23 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 116, "sum": 116.0, "min": 116}, "Total Records Seen": {"count": 1, "max": 580000, "sum": 580000.0, "min": 580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 23, "sum": 23.0, "min": 23}}, "EndTime": 1572781652.963672, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 22}, "StartTime": 1572781652.852899}

[11/03/2019 11:47:32 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=225459.002118 records/second
[2019-11-03 11:47:33.079] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 47, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 24 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 121, "sum": 121.0, "min": 121}, "Total Records Seen": {"count": 1, "max": 605000, "sum": 605000.0, "min": 605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 24, "sum": 24.0, "min": 24}}, "EndTime": 1572781653.080286, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 23}, "StartTime": 1572781652.963875}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=214553.817697 records/second
[2019-11-03 11:47:33.194] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 49, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 25 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 126, "sum": 126.0, "min": 126}, "Total Records Seen": {"count": 1, "max": 630000, "sum": 630000.0, "min": 630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 25, "sum": 25.0, "min": 25}}, "EndTime": 1572781653.194447, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 24}, "StartTime": 1572781653.080736}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=219639.386018 records/second
[2019-11-03 11:47:33.304] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 51, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 26 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 131, "sum": 131.0, "min": 131}, "Total Records Seen": {"count": 1, "max": 655000, "sum": 655000.0, "min": 655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 26, "sum": 26.0, "min": 26}}, "EndTime": 1572781653.304679, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 25}, "StartTime": 1572781653.194648}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=226936.503505 records/second
[2019-11-03 11:47:33.408] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 53, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 27 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 136, "sum": 136.0, "min": 136}, "Total Records Seen": {"count": 1, "max": 680000, "sum": 680000.0, "min": 680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 27, "sum": 27.0, "min": 27}}, "EndTime": 1572781653.408885, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 26}, "StartTime": 1572781653.305177}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=240719.926538 records/second
[2019-11-03 11:47:33.506] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 55, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 28 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 141, "sum": 141.0, "min": 141}, "Total Records Seen": {"count": 1, "max": 705000, "sum": 705000.0, "min": 705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 28, "sum": 28.0, "min": 28}}, "EndTime": 1572781653.507087, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 27}, "StartTime": 1572781653.409142}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254881.161309 records/second
[2019-11-03 11:47:33.616] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 57, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 29 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 146, "sum": 146.0, "min": 146}, "Total Records Seen": {"count": 1, "max": 730000, "sum": 730000.0, "min": 730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 29, "sum": 29.0, "min": 29}}, "EndTime": 1572781653.616641, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 28}, "StartTime": 1572781653.507342}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=228445.939469 records/second
[2019-11-03 11:47:33.285] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 47, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 24 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 121, "sum": 121.0, "min": 121}, "Total Records Seen": {"count": 1, "max": 605000, "sum": 605000.0, "min": 605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 24, "sum": 24.0, "min": 24}}, "EndTime": 1572781653.285798, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 23}, "StartTime": 1572781653.184255}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=245898.125588 records/second
[2019-11-03 11:47:33.395] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 49, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 25 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 126, "sum": 126.0, "min": 126}, "Total Records Seen": {"count": 1, "max": 630000, "sum": 630000.0, "min": 630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 25, "sum": 25.0, "min": 25}}, "EndTime": 1572781653.395731, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 24}, "StartTime": 1572781653.287535}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=230465.887587 records/second
[2019-11-03 11:47:33.507] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 51, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 26 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 131, "sum": 131.0, "min": 131}, "Total Records Seen": {"count": 1, "max": 655000, "sum": 655000.0, "min": 655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 26, "sum": 26.0, "min": 26}}, "EndTime": 1572781653.507964, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 25}, "StartTime": 1572781653.396171}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=223359.803688 records/second
[2019-11-03 11:47:33.614] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 53, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 27 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 136, "sum": 136.0, "min": 136}, "Total Records Seen": {"count": 1, "max": 680000, "sum": 680000.0, "min": 680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 27, "sum": 27.0, "min": 27}}, "EndTime": 1572781653.615178, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 26}, "StartTime": 1572781653.508207}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233372.133136 records/second
[2019-11-03 11:47:33.734] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 55, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 28 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 141, "sum": 141.0, "min": 141}, "Total Records Seen": {"count": 1, "max": 705000, "sum": 705000.0, "min": 705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 28, "sum": 28.0, "min": 28}}, "EndTime": 1572781653.735415, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 27}, "StartTime": 1572781653.615447}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208149.086275 records/second
[2019-11-03 11:47:33.843] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 57, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 29 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 146, "sum": 146.0, "min": 146}, "Total Records Seen": {"count": 1, "max": 730000, "sum": 730000.0, "min": 730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 29, "sum": 29.0, "min": 29}}, "EndTime": 1572781653.843744, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 28}, "StartTime": 1572781653.737312}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=234438.104777 records/second
[2019-11-03 11:47:33.945] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 59, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140169171593024] #progress_metric: host=algo-1, completed 30 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 151, "sum": 151.0, "min": 151}, "Total Records Seen": {"count": 1, "max": 755000, "sum": 755000.0, "min": 755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 30, "sum": 30.0, "min": 30}}, "EndTime": 1572781653.946883, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 29}, "StartTime": 1572781653.845438}

[11/03/2019 11:47:33 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=246162.514174 records/second
[2019-11-03 11:47:34.066] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 61, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 31 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 156, "sum": 156.0, "min": 156}, "Total Records Seen": {"count": 1, "max": 780000, "sum": 780000.0, "min": 780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 31, "sum": 31.0, "min": 31}}, "EndTime": 1572781654.067035, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 30}, "StartTime": 1572781653.94709}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208220.592586 records/second
[2019-11-03 11:47:34.171] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 63, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 32 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 161, "sum": 161.0, "min": 161}, "Total Records Seen": {"count": 1, "max": 805000, "sum": 805000.0, "min": 805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 32, "sum": 32.0, "min": 32}}, "EndTime": 1572781654.171523, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 31}, "StartTime": 1572781654.068661}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=242749.526574 records/second
[2019-11-03 11:47:33.717] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 59, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 30 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 151, "sum": 151.0, "min": 151}, "Total Records Seen": {"count": 1, "max": 755000, "sum": 755000.0, "min": 755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 30, "sum": 30.0, "min": 30}}, "EndTime": 1572781653.71781, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 29}, "StartTime": 1572781653.616888}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=247224.029801 records/second
[2019-11-03 11:47:33.821] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 61, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 31 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 156, "sum": 156.0, "min": 156}, "Total Records Seen": {"count": 1, "max": 780000, "sum": 780000.0, "min": 780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 31, "sum": 31.0, "min": 31}}, "EndTime": 1572781653.821933, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 30}, "StartTime": 1572781653.718127}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=240509.56349 records/second
[2019-11-03 11:47:33.916] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 63, "duration": 94, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:33 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:33 INFO 140552810366784] #progress_metric: host=algo-2, completed 32 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 161, "sum": 161.0, "min": 161}, "Total Records Seen": {"count": 1, "max": 805000, "sum": 805000.0, "min": 805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 32, "sum": 32.0, "min": 32}}, "EndTime": 1572781653.916884, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 31}, "StartTime": 1572781653.822185}

[11/03/2019 11:47:33 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=263612.98335 records/second
[2019-11-03 11:47:34.010] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 65, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 33 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 166, "sum": 166.0, "min": 166}, "Total Records Seen": {"count": 1, "max": 830000, "sum": 830000.0, "min": 830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 33, "sum": 33.0, "min": 33}}, "EndTime": 1572781654.011389, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 32}, "StartTime": 1572781653.917124}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=264847.430143 records/second
[2019-11-03 11:47:34.105] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 67, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 34 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 171, "sum": 171.0, "min": 171}, "Total Records Seen": {"count": 1, "max": 855000, "sum": 855000.0, "min": 855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 34, "sum": 34.0, "min": 34}}, "EndTime": 1572781654.106247, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 33}, "StartTime": 1572781654.011634}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=263838.502784 records/second
[2019-11-03 11:47:34.205] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 69, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 35 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 176, "sum": 176.0, "min": 176}, "Total Records Seen": {"count": 1, "max": 880000, "sum": 880000.0, "min": 880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 35, "sum": 35.0, "min": 35}}, "EndTime": 1572781654.205973, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 34}, "StartTime": 1572781654.106499}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=250973.784295 records/second
[2019-11-03 11:47:34.306] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 71, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 36 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 181, "sum": 181.0, "min": 181}, "Total Records Seen": {"count": 1, "max": 905000, "sum": 905000.0, "min": 905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 36, "sum": 36.0, "min": 36}}, "EndTime": 1572781654.306714, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 35}, "StartTime": 1572781654.206226}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=248451.820189 records/second
[2019-11-03 11:47:34.400] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 73, "duration": 91, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 37 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 186, "sum": 186.0, "min": 186}, "Total Records Seen": {"count": 1, "max": 930000, "sum": 930000.0, "min": 930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 37, "sum": 37.0, "min": 37}}, "EndTime": 1572781654.400918, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 36}, "StartTime": 1572781654.308464}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=269996.163423 records/second
[2019-11-03 11:47:34.509] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 75, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 38 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 191, "sum": 191.0, "min": 191}, "Total Records Seen": {"count": 1, "max": 955000, "sum": 955000.0, "min": 955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 38, "sum": 38.0, "min": 38}}, "EndTime": 1572781654.50983, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 37}, "StartTime": 1572781654.402811}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=233186.856197 records/second
[2019-11-03 11:47:34.606] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 77, "duration": 94, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 39 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 196, "sum": 196.0, "min": 196}, "Total Records Seen": {"count": 1, "max": 980000, "sum": 980000.0, "min": 980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 39, "sum": 39.0, "min": 39}}, "EndTime": 1572781654.606595, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 38}, "StartTime": 1572781654.511681}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=263049.548069 records/second
[2019-11-03 11:47:34.280] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 65, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 33 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 166, "sum": 166.0, "min": 166}, "Total Records Seen": {"count": 1, "max": 830000, "sum": 830000.0, "min": 830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 33, "sum": 33.0, "min": 33}}, "EndTime": 1572781654.280882, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 32}, "StartTime": 1572781654.171763}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=228820.324144 records/second
[2019-11-03 11:47:34.382] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 67, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 34 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 171, "sum": 171.0, "min": 171}, "Total Records Seen": {"count": 1, "max": 855000, "sum": 855000.0, "min": 855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 34, "sum": 34.0, "min": 34}}, "EndTime": 1572781654.3829, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 33}, "StartTime": 1572781654.281128}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=245078.566704 records/second
[2019-11-03 11:47:34.495] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 69, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 35 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 176, "sum": 176.0, "min": 176}, "Total Records Seen": {"count": 1, "max": 880000, "sum": 880000.0, "min": 880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 35, "sum": 35.0, "min": 35}}, "EndTime": 1572781654.495952, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 34}, "StartTime": 1572781654.383263}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=221599.585786 records/second
[2019-11-03 11:47:34.602] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 71, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 36 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 181, "sum": 181.0, "min": 181}, "Total Records Seen": {"count": 1, "max": 905000, "sum": 905000.0, "min": 905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 36, "sum": 36.0, "min": 36}}, "EndTime": 1572781654.603045, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 35}, "StartTime": 1572781654.496192}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233684.187068 records/second
[2019-11-03 11:47:34.716] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 73, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 37 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 186, "sum": 186.0, "min": 186}, "Total Records Seen": {"count": 1, "max": 930000, "sum": 930000.0, "min": 930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 37, "sum": 37.0, "min": 37}}, "EndTime": 1572781654.716717, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 36}, "StartTime": 1572781654.603287}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=220101.342133 records/second
[2019-11-03 11:47:34.835] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 75, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 38 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 191, "sum": 191.0, "min": 191}, "Total Records Seen": {"count": 1, "max": 955000, "sum": 955000.0, "min": 955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 38, "sum": 38.0, "min": 38}}, "EndTime": 1572781654.836292, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 37}, "StartTime": 1572781654.716984}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209318.750287 records/second
[2019-11-03 11:47:34.942] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 77, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140169171593024] #progress_metric: host=algo-1, completed 39 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 196, "sum": 196.0, "min": 196}, "Total Records Seen": {"count": 1, "max": 980000, "sum": 980000.0, "min": 980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 39, "sum": 39.0, "min": 39}}, "EndTime": 1572781654.942638, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 38}, "StartTime": 1572781654.836537}

[11/03/2019 11:47:34 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235349.463572 records/second
[2019-11-03 11:47:35.055] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 79, "duration": 110, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 40 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 201, "sum": 201.0, "min": 201}, "Total Records Seen": {"count": 1, "max": 1005000, "sum": 1005000.0, "min": 1005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 40, "sum": 40.0, "min": 40}}, "EndTime": 1572781655.055605, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 39}, "StartTime": 1572781654.944466}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=224686.993098 records/second
[2019-11-03 11:47:35.176] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 81, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 41 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 206, "sum": 206.0, "min": 206}, "Total Records Seen": {"count": 1, "max": 1030000, "sum": 1030000.0, "min": 1030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 41, "sum": 41.0, "min": 41}}, "EndTime": 1572781655.177454, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 40}, "StartTime": 1572781655.057545}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208179.253468 records/second
[2019-11-03 11:47:34.712] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 79, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 40 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 201, "sum": 201.0, "min": 201}, "Total Records Seen": {"count": 1, "max": 1005000, "sum": 1005000.0, "min": 1005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 40, "sum": 40.0, "min": 40}}, "EndTime": 1572781654.712592, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 39}, "StartTime": 1572781654.606835}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=236098.233163 records/second
[2019-11-03 11:47:34.811] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 81, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 41 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 206, "sum": 206.0, "min": 206}, "Total Records Seen": {"count": 1, "max": 1030000, "sum": 1030000.0, "min": 1030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 41, "sum": 41.0, "min": 41}}, "EndTime": 1572781654.811942, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 40}, "StartTime": 1572781654.712832}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=251942.229281 records/second
[2019-11-03 11:47:34.908] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 83, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:34 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:34 INFO 140552810366784] #progress_metric: host=algo-2, completed 42 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 211, "sum": 211.0, "min": 211}, "Total Records Seen": {"count": 1, "max": 1055000, "sum": 1055000.0, "min": 1055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 42, "sum": 42.0, "min": 42}}, "EndTime": 1572781654.909221, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 41}, "StartTime": 1572781654.812146}

[11/03/2019 11:47:34 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=257259.920411 records/second
[2019-11-03 11:47:35.013] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 85, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 43 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 216, "sum": 216.0, "min": 216}, "Total Records Seen": {"count": 1, "max": 1080000, "sum": 1080000.0, "min": 1080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 43, "sum": 43.0, "min": 43}}, "EndTime": 1572781655.014094, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 42}, "StartTime": 1572781654.909413}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=238581.673887 records/second
[2019-11-03 11:47:35.110] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 87, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 44 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 221, "sum": 221.0, "min": 221}, "Total Records Seen": {"count": 1, "max": 1105000, "sum": 1105000.0, "min": 1105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 44, "sum": 44.0, "min": 44}}, "EndTime": 1572781655.110599, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 43}, "StartTime": 1572781655.01429}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=259303.973233 records/second
[2019-11-03 11:47:35.203] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 89, "duration": 92, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 45 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 226, "sum": 226.0, "min": 226}, "Total Records Seen": {"count": 1, "max": 1130000, "sum": 1130000.0, "min": 1130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 45, "sum": 45.0, "min": 45}}, "EndTime": 1572781655.203846, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 44}, "StartTime": 1572781655.11079}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=268355.078287 records/second
[2019-11-03 11:47:35.301] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 91, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 46 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 231, "sum": 231.0, "min": 231}, "Total Records Seen": {"count": 1, "max": 1155000, "sum": 1155000.0, "min": 1155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 46, "sum": 46.0, "min": 46}}, "EndTime": 1572781655.301512, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 45}, "StartTime": 1572781655.204047}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=256218.311012 records/second
[2019-11-03 11:47:35.402] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 93, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 47 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 236, "sum": 236.0, "min": 236}, "Total Records Seen": {"count": 1, "max": 1180000, "sum": 1180000.0, "min": 1180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 47, "sum": 47.0, "min": 47}}, "EndTime": 1572781655.402659, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 46}, "StartTime": 1572781655.301705}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=247390.26318 records/second
[2019-11-03 11:47:35.497] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 95, "duration": 94, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 48 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 241, "sum": 241.0, "min": 241}, "Total Records Seen": {"count": 1, "max": 1205000, "sum": 1205000.0, "min": 1205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 48, "sum": 48.0, "min": 48}}, "EndTime": 1572781655.498125, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 47}, "StartTime": 1572781655.402851}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=262123.030158 records/second
[2019-11-03 11:47:35.597] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 97, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 49 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 246, "sum": 246.0, "min": 246}, "Total Records Seen": {"count": 1, "max": 1230000, "sum": 1230000.0, "min": 1230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 49, "sum": 49.0, "min": 49}}, "EndTime": 1572781655.597852, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 48}, "StartTime": 1572781655.499737}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254351.928665 records/second
[2019-11-03 11:47:35.291] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 83, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 42 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 211, "sum": 211.0, "min": 211}, "Total Records Seen": {"count": 1, "max": 1055000, "sum": 1055000.0, "min": 1055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 42, "sum": 42.0, "min": 42}}, "EndTime": 1572781655.291625, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 41}, "StartTime": 1572781655.179141}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=221949.970155 records/second
[2019-11-03 11:47:35.408] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 85, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 43 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 216, "sum": 216.0, "min": 216}, "Total Records Seen": {"count": 1, "max": 1080000, "sum": 1080000.0, "min": 1080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 43, "sum": 43.0, "min": 43}}, "EndTime": 1572781655.408899, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 42}, "StartTime": 1572781655.291992}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=213604.075804 records/second
[2019-11-03 11:47:35.513] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 87, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 44 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 221, "sum": 221.0, "min": 221}, "Total Records Seen": {"count": 1, "max": 1105000, "sum": 1105000.0, "min": 1105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 44, "sum": 44.0, "min": 44}}, "EndTime": 1572781655.513764, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 43}, "StartTime": 1572781655.40914}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=238626.738367 records/second
[2019-11-03 11:47:35.637] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 89, "duration": 121, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 45 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 226, "sum": 226.0, "min": 226}, "Total Records Seen": {"count": 1, "max": 1130000, "sum": 1130000.0, "min": 1130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 45, "sum": 45.0, "min": 45}}, "EndTime": 1572781655.637705, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 44}, "StartTime": 1572781655.514016}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=201894.610373 records/second
[2019-11-03 11:47:35.743] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 91, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 46 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 231, "sum": 231.0, "min": 231}, "Total Records Seen": {"count": 1, "max": 1155000, "sum": 1155000.0, "min": 1155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 46, "sum": 46.0, "min": 46}}, "EndTime": 1572781655.744241, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 45}, "StartTime": 1572781655.637956}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=234933.94992 records/second
[2019-11-03 11:47:35.854] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 93, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 47 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 236, "sum": 236.0, "min": 236}, "Total Records Seen": {"count": 1, "max": 1180000, "sum": 1180000.0, "min": 1180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 47, "sum": 47.0, "min": 47}}, "EndTime": 1572781655.854611, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 46}, "StartTime": 1572781655.744479}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226738.744973 records/second
[2019-11-03 11:47:35.963] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 95, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140169171593024] #progress_metric: host=algo-1, completed 48 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 241, "sum": 241.0, "min": 241}, "Total Records Seen": {"count": 1, "max": 1205000, "sum": 1205000.0, "min": 1205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 48, "sum": 48.0, "min": 48}}, "EndTime": 1572781655.96422, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 47}, "StartTime": 1572781655.854849}

[11/03/2019 11:47:35 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=228308.159928 records/second
[2019-11-03 11:47:36.080] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 97, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 49 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 246, "sum": 246.0, "min": 246}, "Total Records Seen": {"count": 1, "max": 1230000, "sum": 1230000.0, "min": 1230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 49, "sum": 49.0, "min": 49}}, "EndTime": 1572781656.081254, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 48}, "StartTime": 1572781655.964466}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=213826.658999 records/second
[2019-11-03 11:47:36.195] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 99, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 50 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 251, "sum": 251.0, "min": 251}, "Total Records Seen": {"count": 1, "max": 1255000, "sum": 1255000.0, "min": 1255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 50, "sum": 50.0, "min": 50}}, "EndTime": 1572781656.196875, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 49}, "StartTime": 1572781656.081495}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216472.608961 records/second
[2019-11-03 11:47:35.695] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 99, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 50 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 251, "sum": 251.0, "min": 251}, "Total Records Seen": {"count": 1, "max": 1255000, "sum": 1255000.0, "min": 1255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 50, "sum": 50.0, "min": 50}}, "EndTime": 1572781655.69541, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 49}, "StartTime": 1572781655.599632}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=260737.322147 records/second
[2019-11-03 11:47:35.811] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 101, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 51 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 256, "sum": 256.0, "min": 256}, "Total Records Seen": {"count": 1, "max": 1280000, "sum": 1280000.0, "min": 1280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 51, "sum": 51.0, "min": 51}}, "EndTime": 1572781655.811704, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 50}, "StartTime": 1572781655.697071}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=217885.922078 records/second
[2019-11-03 11:47:35.926] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 103, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:35 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:35 INFO 140552810366784] #progress_metric: host=algo-2, completed 52 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 261, "sum": 261.0, "min": 261}, "Total Records Seen": {"count": 1, "max": 1305000, "sum": 1305000.0, "min": 1305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 52, "sum": 52.0, "min": 52}}, "EndTime": 1572781655.926947, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 51}, "StartTime": 1572781655.813594}

[11/03/2019 11:47:35 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=220338.142528 records/second
[2019-11-03 11:47:36.025] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 105, "duration": 96, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 53 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 266, "sum": 266.0, "min": 266}, "Total Records Seen": {"count": 1, "max": 1330000, "sum": 1330000.0, "min": 1330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 53, "sum": 53.0, "min": 53}}, "EndTime": 1572781656.026239, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 52}, "StartTime": 1572781655.928995}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=256794.960963 records/second
[2019-11-03 11:47:36.130] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 107, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 54 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 271, "sum": 271.0, "min": 271}, "Total Records Seen": {"count": 1, "max": 1355000, "sum": 1355000.0, "min": 1355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 54, "sum": 54.0, "min": 54}}, "EndTime": 1572781656.131204, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 53}, "StartTime": 1572781656.027974}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241924.550851 records/second
[2019-11-03 11:47:36.221] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 109, "duration": 90, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 55 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 276, "sum": 276.0, "min": 276}, "Total Records Seen": {"count": 1, "max": 1380000, "sum": 1380000.0, "min": 1380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 55, "sum": 55.0, "min": 55}}, "EndTime": 1572781656.222293, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 54}, "StartTime": 1572781656.131432}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=274815.754437 records/second
[2019-11-03 11:47:36.316] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 111, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 56 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 281, "sum": 281.0, "min": 281}, "Total Records Seen": {"count": 1, "max": 1405000, "sum": 1405000.0, "min": 1405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 56, "sum": 56.0, "min": 56}}, "EndTime": 1572781656.317027, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 55}, "StartTime": 1572781656.222511}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=264206.794548 records/second
[2019-11-03 11:47:36.408] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 113, "duration": 89, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 57 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 286, "sum": 286.0, "min": 286}, "Total Records Seen": {"count": 1, "max": 1430000, "sum": 1430000.0, "min": 1430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 57, "sum": 57.0, "min": 57}}, "EndTime": 1572781656.40873, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 56}, "StartTime": 1572781656.318606}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=277057.301919 records/second
[2019-11-03 11:47:36.506] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 115, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 58 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 291, "sum": 291.0, "min": 291}, "Total Records Seen": {"count": 1, "max": 1455000, "sum": 1455000.0, "min": 1455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 58, "sum": 58.0, "min": 58}}, "EndTime": 1572781656.507134, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 57}, "StartTime": 1572781656.408955}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254313.064948 records/second
[2019-11-03 11:47:36.611] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 117, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 59 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 296, "sum": 296.0, "min": 296}, "Total Records Seen": {"count": 1, "max": 1480000, "sum": 1480000.0, "min": 1480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 59, "sum": 59.0, "min": 59}}, "EndTime": 1572781656.61187, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 58}, "StartTime": 1572781656.508761}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=242210.667584 records/second
[2019-11-03 11:47:36.315] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 101, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 51 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 256, "sum": 256.0, "min": 256}, "Total Records Seen": {"count": 1, "max": 1280000, "sum": 1280000.0, "min": 1280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 51, "sum": 51.0, "min": 51}}, "EndTime": 1572781656.316061, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 50}, "StartTime": 1572781656.19707}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209839.005013 records/second
[2019-11-03 11:47:36.435] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 103, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 52 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 261, "sum": 261.0, "min": 261}, "Total Records Seen": {"count": 1, "max": 1305000, "sum": 1305000.0, "min": 1305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 52, "sum": 52.0, "min": 52}}, "EndTime": 1572781656.435687, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 51}, "StartTime": 1572781656.316337}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=209205.157825 records/second
[2019-11-03 11:47:36.550] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 105, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 53 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 266, "sum": 266.0, "min": 266}, "Total Records Seen": {"count": 1, "max": 1330000, "sum": 1330000.0, "min": 1330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 53, "sum": 53.0, "min": 53}}, "EndTime": 1572781656.550555, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 52}, "StartTime": 1572781656.436036}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=218056.742633 records/second
[2019-11-03 11:47:36.667] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 107, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 54 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 271, "sum": 271.0, "min": 271}, "Total Records Seen": {"count": 1, "max": 1355000, "sum": 1355000.0, "min": 1355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 54, "sum": 54.0, "min": 54}}, "EndTime": 1572781656.668235, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 53}, "StartTime": 1572781656.550799}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=212650.198033 records/second
[2019-11-03 11:47:36.783] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 109, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 55 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 276, "sum": 276.0, "min": 276}, "Total Records Seen": {"count": 1, "max": 1380000, "sum": 1380000.0, "min": 1380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 55, "sum": 55.0, "min": 55}}, "EndTime": 1572781656.783669, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 54}, "StartTime": 1572781656.668477}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216786.336732 records/second
[2019-11-03 11:47:36.893] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 111, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140169171593024] #progress_metric: host=algo-1, completed 56 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 281, "sum": 281.0, "min": 281}, "Total Records Seen": {"count": 1, "max": 1405000, "sum": 1405000.0, "min": 1405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 56, "sum": 56.0, "min": 56}}, "EndTime": 1572781656.894322, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 55}, "StartTime": 1572781656.783985}

[11/03/2019 11:47:36 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226315.925788 records/second
[2019-11-03 11:47:37.002] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 113, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 57 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 286, "sum": 286.0, "min": 286}, "Total Records Seen": {"count": 1, "max": 1430000, "sum": 1430000.0, "min": 1430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 57, "sum": 57.0, "min": 57}}, "EndTime": 1572781657.002531, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 56}, "StartTime": 1572781656.894559}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=231273.599887 records/second
[2019-11-03 11:47:37.117] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 115, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 58 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 291, "sum": 291.0, "min": 291}, "Total Records Seen": {"count": 1, "max": 1455000, "sum": 1455000.0, "min": 1455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 58, "sum": 58.0, "min": 58}}, "EndTime": 1572781657.117718, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 57}, "StartTime": 1572781657.004481}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=220530.454552 records/second
[2019-11-03 11:47:37.229] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 117, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 59 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 296, "sum": 296.0, "min": 296}, "Total Records Seen": {"count": 1, "max": 1480000, "sum": 1480000.0, "min": 1480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 59, "sum": 59.0, "min": 59}}, "EndTime": 1572781657.229763, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 58}, "StartTime": 1572781657.119333}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226135.826937 records/second
[2019-11-03 11:47:37.346] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 119, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 60 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 301, "sum": 301.0, "min": 301}, "Total Records Seen": {"count": 1, "max": 1505000, "sum": 1505000.0, "min": 1505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 60, "sum": 60.0, "min": 60}}, "EndTime": 1572781657.346454, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 59}, "StartTime": 1572781657.231374}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=217004.377024 records/second
[2019-11-03 11:47:37.458] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 121, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 61 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 306, "sum": 306.0, "min": 306}, "Total Records Seen": {"count": 1, "max": 1530000, "sum": 1530000.0, "min": 1530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 61, "sum": 61.0, "min": 61}}, "EndTime": 1572781657.459096, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 60}, "StartTime": 1572781657.346694}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=222127.695632 records/second
[2019-11-03 11:47:37.564] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 123, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 62 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 311, "sum": 311.0, "min": 311}, "Total Records Seen": {"count": 1, "max": 1555000, "sum": 1555000.0, "min": 1555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 62, "sum": 62.0, "min": 62}}, "EndTime": 1572781657.56538, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 61}, "StartTime": 1572781657.459356}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235513.85916 records/second
[2019-11-03 11:47:37.674] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 125, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 63 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 316, "sum": 316.0, "min": 316}, "Total Records Seen": {"count": 1, "max": 1580000, "sum": 1580000.0, "min": 1580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 63, "sum": 63.0, "min": 63}}, "EndTime": 1572781657.674904, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 62}, "StartTime": 1572781657.565617}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=228475.307499 records/second
[2019-11-03 11:47:37.780] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 127, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 64 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 321, "sum": 321.0, "min": 321}, "Total Records Seen": {"count": 1, "max": 1605000, "sum": 1605000.0, "min": 1605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 64, "sum": 64.0, "min": 64}}, "EndTime": 1572781657.780621, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 63}, "StartTime": 1572781657.675158}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=236744.830825 records/second
[2019-11-03 11:47:37.902] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 129, "duration": 121, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140169171593024] #progress_metric: host=algo-1, completed 65 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 326, "sum": 326.0, "min": 326}, "Total Records Seen": {"count": 1, "max": 1630000, "sum": 1630000.0, "min": 1630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 65, "sum": 65.0, "min": 65}}, "EndTime": 1572781657.903117, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 64}, "StartTime": 1572781657.78087}

[11/03/2019 11:47:37 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=204263.409598 records/second
[2019-11-03 11:47:38.008] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 131, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 66 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 331, "sum": 331.0, "min": 331}, "Total Records Seen": {"count": 1, "max": 1655000, "sum": 1655000.0, "min": 1655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 66, "sum": 66.0, "min": 66}}, "EndTime": 1572781658.009231, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 65}, "StartTime": 1572781657.90343}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235951.071548 records/second
[2019-11-03 11:47:38.115] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 133, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 67 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 336, "sum": 336.0, "min": 336}, "Total Records Seen": {"count": 1, "max": 1680000, "sum": 1680000.0, "min": 1680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 67, "sum": 67.0, "min": 67}}, "EndTime": 1572781658.115497, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 66}, "StartTime": 1572781658.009521}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=235577.882222 records/second
[2019-11-03 11:47:38.218] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 135, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 68 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 341, "sum": 341.0, "min": 341}, "Total Records Seen": {"count": 1, "max": 1705000, "sum": 1705000.0, "min": 1705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 68, "sum": 68.0, "min": 68}}, "EndTime": 1572781658.218867, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 67}, "StartTime": 1572781658.115755}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=242118.947176 records/second
[2019-11-03 11:47:36.720] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 119, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 60 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 301, "sum": 301.0, "min": 301}, "Total Records Seen": {"count": 1, "max": 1505000, "sum": 1505000.0, "min": 1505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 60, "sum": 60.0, "min": 60}}, "EndTime": 1572781656.720562, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 59}, "StartTime": 1572781656.613444}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=233106.50539 records/second
[2019-11-03 11:47:36.829] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 121, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 61 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 306, "sum": 306.0, "min": 306}, "Total Records Seen": {"count": 1, "max": 1530000, "sum": 1530000.0, "min": 1530000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 61, "sum": 61.0, "min": 61}}, "EndTime": 1572781656.829433, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 60}, "StartTime": 1572781656.720761}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=229819.839565 records/second
[2019-11-03 11:47:36.947] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 123, "duration": 115, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:36 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:36 INFO 140552810366784] #progress_metric: host=algo-2, completed 62 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 311, "sum": 311.0, "min": 311}, "Total Records Seen": {"count": 1, "max": 1555000, "sum": 1555000.0, "min": 1555000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 62, "sum": 62.0, "min": 62}}, "EndTime": 1572781656.94762, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 61}, "StartTime": 1572781656.831255}

[11/03/2019 11:47:36 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=214647.367204 records/second
[2019-11-03 11:47:37.049] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 125, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 63 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 316, "sum": 316.0, "min": 316}, "Total Records Seen": {"count": 1, "max": 1580000, "sum": 1580000.0, "min": 1580000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 63, "sum": 63.0, "min": 63}}, "EndTime": 1572781657.050018, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 62}, "StartTime": 1572781656.949825}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=249187.496138 records/second
[2019-11-03 11:47:37.145] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 127, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 64 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 321, "sum": 321.0, "min": 321}, "Total Records Seen": {"count": 1, "max": 1605000, "sum": 1605000.0, "min": 1605000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 64, "sum": 64.0, "min": 64}}, "EndTime": 1572781657.146114, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 63}, "StartTime": 1572781657.050269}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=260495.066231 records/second
[2019-11-03 11:47:37.247] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 129, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 65 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 326, "sum": 326.0, "min": 326}, "Total Records Seen": {"count": 1, "max": 1630000, "sum": 1630000.0, "min": 1630000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 65, "sum": 65.0, "min": 65}}, "EndTime": 1572781657.247753, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 64}, "StartTime": 1572781657.14635}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=246215.691385 records/second
[2019-11-03 11:47:37.343] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 131, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 66 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 331, "sum": 331.0, "min": 331}, "Total Records Seen": {"count": 1, "max": 1655000, "sum": 1655000.0, "min": 1655000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 66, "sum": 66.0, "min": 66}}, "EndTime": 1572781657.344179, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 65}, "StartTime": 1572781657.248007}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=259551.727125 records/second
[2019-11-03 11:47:37.451] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 133, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 67 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 336, "sum": 336.0, "min": 336}, "Total Records Seen": {"count": 1, "max": 1680000, "sum": 1680000.0, "min": 1680000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 67, "sum": 67.0, "min": 67}}, "EndTime": 1572781657.451658, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 66}, "StartTime": 1572781657.344442}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=232884.920767 records/second
[2019-11-03 11:47:37.550] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 135, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 68 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 341, "sum": 341.0, "min": 341}, "Total Records Seen": {"count": 1, "max": 1705000, "sum": 1705000.0, "min": 1705000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 68, "sum": 68.0, "min": 68}}, "EndTime": 1572781657.551124, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 67}, "StartTime": 1572781657.451914}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=251663.4746 records/second
[2019-11-03 11:47:37.652] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 137, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 69 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 346, "sum": 346.0, "min": 346}, "Total Records Seen": {"count": 1, "max": 1730000, "sum": 1730000.0, "min": 1730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 69, "sum": 69.0, "min": 69}}, "EndTime": 1572781657.652952, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 68}, "StartTime": 1572781657.551352}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=245806.472786 records/second
[2019-11-03 11:47:37.745] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 139, "duration": 91, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 70 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 351, "sum": 351.0, "min": 351}, "Total Records Seen": {"count": 1, "max": 1755000, "sum": 1755000.0, "min": 1755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 70, "sum": 70.0, "min": 70}}, "EndTime": 1572781657.745565, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 69}, "StartTime": 1572781657.653178}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=270211.850321 records/second
[2019-11-03 11:47:37.839] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 141, "duration": 93, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 71 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 356, "sum": 356.0, "min": 356}, "Total Records Seen": {"count": 1, "max": 1780000, "sum": 1780000.0, "min": 1780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 71, "sum": 71.0, "min": 71}}, "EndTime": 1572781657.839737, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 70}, "StartTime": 1572781657.745817}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=265818.946941 records/second
[2019-11-03 11:47:37.946] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 143, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:37 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:37 INFO 140552810366784] #progress_metric: host=algo-2, completed 72 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 361, "sum": 361.0, "min": 361}, "Total Records Seen": {"count": 1, "max": 1805000, "sum": 1805000.0, "min": 1805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 72, "sum": 72.0, "min": 72}}, "EndTime": 1572781657.94681, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 71}, "StartTime": 1572781657.839979}

[11/03/2019 11:47:37 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=233723.773457 records/second
[2019-11-03 11:47:38.048] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 145, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 73 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 366, "sum": 366.0, "min": 366}, "Total Records Seen": {"count": 1, "max": 1830000, "sum": 1830000.0, "min": 1830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 73, "sum": 73.0, "min": 73}}, "EndTime": 1572781658.048629, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 72}, "StartTime": 1572781657.947054}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=245784.578458 records/second
[2019-11-03 11:47:38.153] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 147, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 74 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 371, "sum": 371.0, "min": 371}, "Total Records Seen": {"count": 1, "max": 1855000, "sum": 1855000.0, "min": 1855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 74, "sum": 74.0, "min": 74}}, "EndTime": 1572781658.153764, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 73}, "StartTime": 1572781658.048879}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=238061.139932 records/second
[2019-11-03 11:47:38.257] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 149, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 75 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 376, "sum": 376.0, "min": 376}, "Total Records Seen": {"count": 1, "max": 1880000, "sum": 1880000.0, "min": 1880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 75, "sum": 75.0, "min": 75}}, "EndTime": 1572781658.25836, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 74}, "StartTime": 1572781658.154126}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=239560.073017 records/second
[2019-11-03 11:47:38.361] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 151, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 76 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 381, "sum": 381.0, "min": 381}, "Total Records Seen": {"count": 1, "max": 1905000, "sum": 1905000.0, "min": 1905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 76, "sum": 76.0, "min": 76}}, "EndTime": 1572781658.362082, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 75}, "StartTime": 1572781658.258831}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241788.993576 records/second
[2019-11-03 11:47:38.461] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 153, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 77 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 386, "sum": 386.0, "min": 386}, "Total Records Seen": {"count": 1, "max": 1930000, "sum": 1930000.0, "min": 1930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 77, "sum": 77.0, "min": 77}}, "EndTime": 1572781658.462326, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 76}, "StartTime": 1572781658.362336}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=249697.812534 records/second
[2019-11-03 11:47:38.572] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 155, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 78 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 391, "sum": 391.0, "min": 391}, "Total Records Seen": {"count": 1, "max": 1955000, "sum": 1955000.0, "min": 1955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 78, "sum": 78.0, "min": 78}}, "EndTime": 1572781658.573086, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 77}, "StartTime": 1572781658.462574}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=225957.962151 records/second
[2019-11-03 11:47:38.327] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 137, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 69 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 346, "sum": 346.0, "min": 346}, "Total Records Seen": {"count": 1, "max": 1730000, "sum": 1730000.0, "min": 1730000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 69, "sum": 69.0, "min": 69}}, "EndTime": 1572781658.327902, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 68}, "StartTime": 1572781658.219121}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=229541.126148 records/second
[2019-11-03 11:47:38.434] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 139, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 70 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 351, "sum": 351.0, "min": 351}, "Total Records Seen": {"count": 1, "max": 1755000, "sum": 1755000.0, "min": 1755000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 70, "sum": 70.0, "min": 70}}, "EndTime": 1572781658.435265, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 69}, "StartTime": 1572781658.328147}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233102.359754 records/second
[2019-11-03 11:47:38.545] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 141, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 71 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 356, "sum": 356.0, "min": 356}, "Total Records Seen": {"count": 1, "max": 1780000, "sum": 1780000.0, "min": 1780000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 71, "sum": 71.0, "min": 71}}, "EndTime": 1572781658.54583, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 70}, "StartTime": 1572781658.435508}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=226338.885807 records/second
[2019-11-03 11:47:38.658] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 143, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 72 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 361, "sum": 361.0, "min": 361}, "Total Records Seen": {"count": 1, "max": 1805000, "sum": 1805000.0, "min": 1805000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 72, "sum": 72.0, "min": 72}}, "EndTime": 1572781658.658848, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 71}, "StartTime": 1572781658.546075}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=221397.458284 records/second
[2019-11-03 11:47:38.770] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 145, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 73 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 366, "sum": 366.0, "min": 366}, "Total Records Seen": {"count": 1, "max": 1830000, "sum": 1830000.0, "min": 1830000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 73, "sum": 73.0, "min": 73}}, "EndTime": 1572781658.771435, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 72}, "StartTime": 1572781658.659151}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=222358.50457 records/second
[2019-11-03 11:47:38.893] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 147, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140169171593024] #progress_metric: host=algo-1, completed 74 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 371, "sum": 371.0, "min": 371}, "Total Records Seen": {"count": 1, "max": 1855000, "sum": 1855000.0, "min": 1855000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 74, "sum": 74.0, "min": 74}}, "EndTime": 1572781658.893545, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 73}, "StartTime": 1572781658.771688}

[11/03/2019 11:47:38 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=204919.669885 records/second
[2019-11-03 11:47:39.014] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 149, "duration": 119, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 75 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 376, "sum": 376.0, "min": 376}, "Total Records Seen": {"count": 1, "max": 1880000, "sum": 1880000.0, "min": 1880000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 75, "sum": 75.0, "min": 75}}, "EndTime": 1572781659.01457, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 74}, "StartTime": 1572781658.893806}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=206791.172816 records/second
[2019-11-03 11:47:39.134] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 151, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 76 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 381, "sum": 381.0, "min": 381}, "Total Records Seen": {"count": 1, "max": 1905000, "sum": 1905000.0, "min": 1905000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 76, "sum": 76.0, "min": 76}}, "EndTime": 1572781659.134788, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 75}, "StartTime": 1572781659.014817}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=207656.90476 records/second
[2019-11-03 11:47:38.679] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 157, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 79 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 396, "sum": 396.0, "min": 396}, "Total Records Seen": {"count": 1, "max": 1980000, "sum": 1980000.0, "min": 1980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 79, "sum": 79.0, "min": 79}}, "EndTime": 1572781658.679666, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 78}, "StartTime": 1572781658.57351}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=235247.558516 records/second
[2019-11-03 11:47:38.783] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 159, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 80 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 401, "sum": 401.0, "min": 401}, "Total Records Seen": {"count": 1, "max": 2005000, "sum": 2005000.0, "min": 2005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 80, "sum": 80.0, "min": 80}}, "EndTime": 1572781658.783399, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 79}, "StartTime": 1572781658.68004}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241622.405081 records/second
[2019-11-03 11:47:38.878] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 161, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 81 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 406, "sum": 406.0, "min": 406}, "Total Records Seen": {"count": 1, "max": 2030000, "sum": 2030000.0, "min": 2030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 81, "sum": 81.0, "min": 81}}, "EndTime": 1572781658.879164, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 80}, "StartTime": 1572781658.783612}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=261350.800321 records/second
[2019-11-03 11:47:38.987] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 163, "duration": 108, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:38 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:38 INFO 140552810366784] #progress_metric: host=algo-2, completed 82 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 411, "sum": 411.0, "min": 411}, "Total Records Seen": {"count": 1, "max": 2055000, "sum": 2055000.0, "min": 2055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 82, "sum": 82.0, "min": 82}}, "EndTime": 1572781658.988273, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 81}, "StartTime": 1572781658.879363}

[11/03/2019 11:47:38 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=229286.649859 records/second
[2019-11-03 11:47:39.083] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 165, "duration": 95, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 83 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 416, "sum": 416.0, "min": 416}, "Total Records Seen": {"count": 1, "max": 2080000, "sum": 2080000.0, "min": 2080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 83, "sum": 83.0, "min": 83}}, "EndTime": 1572781659.084386, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 82}, "StartTime": 1572781658.988487}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=260354.065798 records/second
[2019-11-03 11:47:39.195] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 167, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 84 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 421, "sum": 421.0, "min": 421}, "Total Records Seen": {"count": 1, "max": 2105000, "sum": 2105000.0, "min": 2105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 84, "sum": 84.0, "min": 84}}, "EndTime": 1572781659.196119, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 83}, "StartTime": 1572781659.086359}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=227550.123586 records/second
[2019-11-03 11:47:39.306] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 169, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 85 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 426, "sum": 426.0, "min": 426}, "Total Records Seen": {"count": 1, "max": 2130000, "sum": 2130000.0, "min": 2130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 85, "sum": 85.0, "min": 85}}, "EndTime": 1572781659.306819, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 84}, "StartTime": 1572781659.196313}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=226019.330419 records/second
[2019-11-03 11:47:39.418] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 171, "duration": 111, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 86 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 431, "sum": 431.0, "min": 431}, "Total Records Seen": {"count": 1, "max": 2155000, "sum": 2155000.0, "min": 2155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 86, "sum": 86.0, "min": 86}}, "EndTime": 1572781659.418981, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 85}, "StartTime": 1572781659.307034}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=223112.669583 records/second
[2019-11-03 11:47:39.535] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 173, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 87 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 436, "sum": 436.0, "min": 436}, "Total Records Seen": {"count": 1, "max": 2180000, "sum": 2180000.0, "min": 2180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 87, "sum": 87.0, "min": 87}}, "EndTime": 1572781659.535613, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 86}, "StartTime": 1572781659.421016}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=217954.308781 records/second
[2019-11-03 11:47:39.653] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 175, "duration": 116, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 88 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 441, "sum": 441.0, "min": 441}, "Total Records Seen": {"count": 1, "max": 2205000, "sum": 2205000.0, "min": 2205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 88, "sum": 88.0, "min": 88}}, "EndTime": 1572781659.654219, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 87}, "StartTime": 1572781659.537496}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=213978.507789 records/second
[2019-11-03 11:47:39.242] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 153, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 77 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 386, "sum": 386.0, "min": 386}, "Total Records Seen": {"count": 1, "max": 1930000, "sum": 1930000.0, "min": 1930000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 77, "sum": 77.0, "min": 77}}, "EndTime": 1572781659.242942, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 76}, "StartTime": 1572781659.135533}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=232437.345108 records/second
[2019-11-03 11:47:39.362] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 155, "duration": 117, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 78 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 391, "sum": 391.0, "min": 391}, "Total Records Seen": {"count": 1, "max": 1955000, "sum": 1955000.0, "min": 1955000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 78, "sum": 78.0, "min": 78}}, "EndTime": 1572781659.36302, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 77}, "StartTime": 1572781659.244852}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=211332.314873 records/second
[2019-11-03 11:47:39.468] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 157, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 79 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 396, "sum": 396.0, "min": 396}, "Total Records Seen": {"count": 1, "max": 1980000, "sum": 1980000.0, "min": 1980000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 79, "sum": 79.0, "min": 79}}, "EndTime": 1572781659.468665, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 78}, "StartTime": 1572781659.363264}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=236905.829698 records/second
[2019-11-03 11:47:39.579] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 159, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 80 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 401, "sum": 401.0, "min": 401}, "Total Records Seen": {"count": 1, "max": 2005000, "sum": 2005000.0, "min": 2005000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 80, "sum": 80.0, "min": 80}}, "EndTime": 1572781659.579519, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 79}, "StartTime": 1572781659.468902}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=225727.398758 records/second
[2019-11-03 11:47:39.694] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 161, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 81 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 406, "sum": 406.0, "min": 406}, "Total Records Seen": {"count": 1, "max": 2030000, "sum": 2030000.0, "min": 2030000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 81, "sum": 81.0, "min": 81}}, "EndTime": 1572781659.695116, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 80}, "StartTime": 1572781659.579765}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=216469.033856 records/second
[2019-11-03 11:47:39.808] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 163, "duration": 112, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 82 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 411, "sum": 411.0, "min": 411}, "Total Records Seen": {"count": 1, "max": 2055000, "sum": 2055000.0, "min": 2055000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 82, "sum": 82.0, "min": 82}}, "EndTime": 1572781659.80906, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 81}, "StartTime": 1572781659.695364}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=219606.72616 records/second
[2019-11-03 11:47:39.924] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 165, "duration": 114, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140169171593024] #progress_metric: host=algo-1, completed 83 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 416, "sum": 416.0, "min": 416}, "Total Records Seen": {"count": 1, "max": 2080000, "sum": 2080000.0, "min": 2080000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 83, "sum": 83.0, "min": 83}}, "EndTime": 1572781659.925035, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 82}, "StartTime": 1572781659.809357}

[11/03/2019 11:47:39 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=215857.64515 records/second
[2019-11-03 11:47:40.033] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 167, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 84 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 421, "sum": 421.0, "min": 421}, "Total Records Seen": {"count": 1, "max": 2105000, "sum": 2105000.0, "min": 2105000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 84, "sum": 84.0, "min": 84}}, "EndTime": 1572781660.033847, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 83}, "StartTime": 1572781659.925312}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=230064.900587 records/second
[2019-11-03 11:47:40.160] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 169, "duration": 124, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 85 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 426, "sum": 426.0, "min": 426}, "Total Records Seen": {"count": 1, "max": 2130000, "sum": 2130000.0, "min": 2130000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 85, "sum": 85.0, "min": 85}}, "EndTime": 1572781660.160738, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 84}, "StartTime": 1572781660.034084}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=197187.113905 records/second
[2019-11-03 11:47:39.753] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 177, "duration": 97, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 89 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 446, "sum": 446.0, "min": 446}, "Total Records Seen": {"count": 1, "max": 2230000, "sum": 2230000.0, "min": 2230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 89, "sum": 89.0, "min": 89}}, "EndTime": 1572781659.754287, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 88}, "StartTime": 1572781659.656077}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=254284.695766 records/second
[2019-11-03 11:47:39.844] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 179, "duration": 90, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 90 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 451, "sum": 451.0, "min": 451}, "Total Records Seen": {"count": 1, "max": 2255000, "sum": 2255000.0, "min": 2255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 90, "sum": 90.0, "min": 90}}, "EndTime": 1572781659.845219, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 89}, "StartTime": 1572781659.75448}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=275145.303451 records/second
[2019-11-03 11:47:39.953] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 181, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:39 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:39 INFO 140552810366784] #progress_metric: host=algo-2, completed 91 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 456, "sum": 456.0, "min": 456}, "Total Records Seen": {"count": 1, "max": 2280000, "sum": 2280000.0, "min": 2280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 91, "sum": 91.0, "min": 91}}, "EndTime": 1572781659.953676, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 90}, "StartTime": 1572781659.845461}

[11/03/2019 11:47:39 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=230744.313781 records/second
[2019-11-03 11:47:40.060] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 183, "duration": 104, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 92 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 461, "sum": 461.0, "min": 461}, "Total Records Seen": {"count": 1, "max": 2305000, "sum": 2305000.0, "min": 2305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 92, "sum": 92.0, "min": 92}}, "EndTime": 1572781660.060468, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 91}, "StartTime": 1572781659.955789}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=238536.083788 records/second
[2019-11-03 11:47:40.165] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 185, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 93 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 466, "sum": 466.0, "min": 466}, "Total Records Seen": {"count": 1, "max": 2330000, "sum": 2330000.0, "min": 2330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 93, "sum": 93.0, "min": 93}}, "EndTime": 1572781660.165692, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 92}, "StartTime": 1572781660.062342}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=241597.353105 records/second
[2019-11-03 11:47:40.269] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 187, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 94 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 471, "sum": 471.0, "min": 471}, "Total Records Seen": {"count": 1, "max": 2355000, "sum": 2355000.0, "min": 2355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 94, "sum": 94.0, "min": 94}}, "EndTime": 1572781660.269801, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 93}, "StartTime": 1572781660.167789}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=244587.508631 records/second
[2019-11-03 11:47:40.373] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 189, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 95 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 476, "sum": 476.0, "min": 476}, "Total Records Seen": {"count": 1, "max": 2380000, "sum": 2380000.0, "min": 2380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 95, "sum": 95.0, "min": 95}}, "EndTime": 1572781660.374038, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 94}, "StartTime": 1572781660.271758}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=244131.376699 records/second
[2019-11-03 11:47:40.472] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 191, "duration": 98, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 96 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 481, "sum": 481.0, "min": 481}, "Total Records Seen": {"count": 1, "max": 2405000, "sum": 2405000.0, "min": 2405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 96, "sum": 96.0, "min": 96}}, "EndTime": 1572781660.473244, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 95}, "StartTime": 1572781660.374283}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=252290.783452 records/second
[2019-11-03 11:47:40.584] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 193, "duration": 109, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 97 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 486, "sum": 486.0, "min": 486}, "Total Records Seen": {"count": 1, "max": 2430000, "sum": 2430000.0, "min": 2430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 97, "sum": 97.0, "min": 97}}, "EndTime": 1572781660.585374, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 96}, "StartTime": 1572781660.475085}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=226414.149157 records/second
[2019-11-03 11:47:40.268] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 171, "duration": 107, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 86 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 431, "sum": 431.0, "min": 431}, "Total Records Seen": {"count": 1, "max": 2155000, "sum": 2155000.0, "min": 2155000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 86, "sum": 86.0, "min": 86}}, "EndTime": 1572781660.269015, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 85}, "StartTime": 1572781660.160977}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=231130.351597 records/second
[2019-11-03 11:47:40.371] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 173, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 87 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 436, "sum": 436.0, "min": 436}, "Total Records Seen": {"count": 1, "max": 2180000, "sum": 2180000.0, "min": 2180000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 87, "sum": 87.0, "min": 87}}, "EndTime": 1572781660.372365, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 86}, "StartTime": 1572781660.269253}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=242147.462543 records/second
[2019-11-03 11:47:40.486] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 175, "duration": 113, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 88 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 441, "sum": 441.0, "min": 441}, "Total Records Seen": {"count": 1, "max": 2205000, "sum": 2205000.0, "min": 2205000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 88, "sum": 88.0, "min": 88}}, "EndTime": 1572781660.487108, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 87}, "StartTime": 1572781660.372609}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=217679.211637 records/second
[2019-11-03 11:47:40.595] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 177, "duration": 106, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 89 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 446, "sum": 446.0, "min": 446}, "Total Records Seen": {"count": 1, "max": 2230000, "sum": 2230000.0, "min": 2230000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 89, "sum": 89.0, "min": 89}}, "EndTime": 1572781660.596009, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 88}, "StartTime": 1572781660.489044}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=233424.603864 records/second
[2019-11-03 11:47:40.715] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 179, "duration": 118, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 90 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 451, "sum": 451.0, "min": 451}, "Total Records Seen": {"count": 1, "max": 2255000, "sum": 2255000.0, "min": 2255000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 90, "sum": 90.0, "min": 90}}, "EndTime": 1572781660.716018, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 89}, "StartTime": 1572781660.596258}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=208517.47562 records/second
[2019-11-03 11:47:40.854] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 181, "duration": 138, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 91 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 456, "sum": 456.0, "min": 456}, "Total Records Seen": {"count": 1, "max": 2280000, "sum": 2280000.0, "min": 2280000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 91, "sum": 91.0, "min": 91}}, "EndTime": 1572781660.855434, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 90}, "StartTime": 1572781660.716272}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=179375.916273 records/second
[2019-11-03 11:47:40.981] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 183, "duration": 124, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140169171593024] #progress_metric: host=algo-1, completed 92 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 461, "sum": 461.0, "min": 461}, "Total Records Seen": {"count": 1, "max": 2305000, "sum": 2305000.0, "min": 2305000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 92, "sum": 92.0, "min": 92}}, "EndTime": 1572781660.981819, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 91}, "StartTime": 1572781660.855922}

[11/03/2019 11:47:40 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=198300.994743 records/second
[2019-11-03 11:47:41.087] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 185, "duration": 102, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 93 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 466, "sum": 466.0, "min": 466}, "Total Records Seen": {"count": 1, "max": 2330000, "sum": 2330000.0, "min": 2330000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 93, "sum": 93.0, "min": 93}}, "EndTime": 1572781661.087967, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 92}, "StartTime": 1572781660.984266}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=240775.75379 records/second
[2019-11-03 11:47:41.189] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 187, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 94 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 471, "sum": 471.0, "min": 471}, "Total Records Seen": {"count": 1, "max": 2355000, "sum": 2355000.0, "min": 2355000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 94, "sum": 94.0, "min": 94}}, "EndTime": 1572781661.189626, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 93}, "StartTime": 1572781661.089645}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=249742.415979 records/second
[2019-11-03 11:47:40.697] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 195, "duration": 110, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 98 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 491, "sum": 491.0, "min": 491}, "Total Records Seen": {"count": 1, "max": 2455000, "sum": 2455000.0, "min": 2455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 98, "sum": 98.0, "min": 98}}, "EndTime": 1572781660.698355, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 97}, "StartTime": 1572781660.587506}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=225269.616479 records/second
[2019-11-03 11:47:40.838] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 197, "duration": 137, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 99 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 496, "sum": 496.0, "min": 496}, "Total Records Seen": {"count": 1, "max": 2480000, "sum": 2480000.0, "min": 2480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 99, "sum": 99.0, "min": 99}}, "EndTime": 1572781660.838742, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 98}, "StartTime": 1572781660.700431}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=180615.201237 records/second
[2019-11-03 11:47:40.938] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 199, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:40 INFO 140552810366784] processed a total of 25000 examples
[11/03/2019 11:47:40 INFO 140552810366784] #progress_metric: host=algo-2, completed 100 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 501, "sum": 501.0, "min": 501}, "Total Records Seen": {"count": 1, "max": 2505000, "sum": 2505000.0, "min": 2505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 100, "sum": 100.0, "min": 100}}, "EndTime": 1572781660.939309, "Dimensions": {"Host": "algo-2", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 99}, "StartTime": 1572781660.838951}

[11/03/2019 11:47:40 INFO 140552810366784] #throughput_metric: host=algo-2, train throughput=248843.324315 records/second
[2019-11-03 11:47:41.293] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 189, "duration": 101, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 95 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 476, "sum": 476.0, "min": 476}, "Total Records Seen": {"count": 1, "max": 2380000, "sum": 2380000.0, "min": 2380000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 95, "sum": 95.0, "min": 95}}, "EndTime": 1572781661.294615, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 94}, "StartTime": 1572781661.189859}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=238360.94119 records/second
[2019-11-03 11:47:41.400] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 191, "duration": 105, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 96 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 481, "sum": 481.0, "min": 481}, "Total Records Seen": {"count": 1, "max": 2405000, "sum": 2405000.0, "min": 2405000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 96, "sum": 96.0, "min": 96}}, "EndTime": 1572781661.401357, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 95}, "StartTime": 1572781661.294858}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=234471.130948 records/second
[2019-11-03 11:47:41.505] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 193, "duration": 103, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 97 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 486, "sum": 486.0, "min": 486}, "Total Records Seen": {"count": 1, "max": 2430000, "sum": 2430000.0, "min": 2430000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 97, "sum": 97.0, "min": 97}}, "EndTime": 1572781661.506414, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 96}, "StartTime": 1572781661.401588}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=238201.746913 records/second
[2019-11-03 11:47:41.607] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 195, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 98 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 491, "sum": 491.0, "min": 491}, "Total Records Seen": {"count": 1, "max": 2455000, "sum": 2455000.0, "min": 2455000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 98, "sum": 98.0, "min": 98}}, "EndTime": 1572781661.608669, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 97}, "StartTime": 1572781661.506654}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=244752.499282 records/second
[2019-11-03 11:47:41.708] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 197, "duration": 99, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 99 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 496, "sum": 496.0, "min": 496}, "Total Records Seen": {"count": 1, "max": 2480000, "sum": 2480000.0, "min": 2480000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 99, "sum": 99.0, "min": 99}}, "EndTime": 1572781661.70883, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 98}, "StartTime": 1572781661.608911}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=249809.6486 records/second
[2019-11-03 11:47:41.810] [tensorio] [info] epoch_stats={"data_pipeline": "/opt/ml/input/data/train", "epoch": 199, "duration": 100, "num_examples": 5, "num_bytes": 79100000}
[11/03/2019 11:47:41 INFO 140169171593024] processed a total of 25000 examples
[11/03/2019 11:47:41 INFO 140169171593024] #progress_metric: host=algo-1, completed 100 % of epochs
#metrics {"Metrics": {"Max Batches Seen Between Resets": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Batches Since Last Reset": {"count": 1, "max": 5, "sum": 5.0, "min": 5}, "Number of Records Since Last Reset": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Total Batches Seen": {"count": 1, "max": 501, "sum": 501.0, "min": 501}, "Total Records Seen": {"count": 1, "max": 2505000, "sum": 2505000.0, "min": 2505000}, "Max Records Seen Between Resets": {"count": 1, "max": 25000, "sum": 25000.0, "min": 25000}, "Reset Count": {"count": 1, "max": 100, "sum": 100.0, "min": 100}}, "EndTime": 1572781661.810584, "Dimensions": {"Host": "algo-1", "Meta": "training_data_iter", "Operation": "training", "Algorithm": "AWS/KMeansWebscale", "epoch": 99}, "StartTime": 1572781661.70911}

[11/03/2019 11:47:41 INFO 140169171593024] #throughput_metric: host=algo-1, train throughput=246070.664214 records/second
[11/03/2019 11:47:41 INFO 140169171593024] shrinking 100 centers into 10
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #0. Current mean square distance 12.902647
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #1. Current mean square distance 11.803318
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #2. Current mean square distance 12.321064
[11/03/2019 11:47:41 INFO 140169171593024] local kmeans attempt #3. Current mean square distance 12.036984
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #4. Current mean square distance 12.555333
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #5. Current mean square distance 12.615070
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #6. Current mean square distance 11.918087
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #7. Current mean square distance 12.279174
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #8. Current mean square distance 12.339795
[11/03/2019 11:47:42 INFO 140169171593024] local kmeans attempt #9. Current mean square distance 12.555266
[11/03/2019 11:47:42 INFO 140169171593024] finished shrinking process. Mean Square Distance = 12
[11/03/2019 11:47:42 INFO 140169171593024] #quality_metric: host=algo-1, train msd <loss>=11.8033180237
[11/03/2019 11:47:42 INFO 140169171593024] batch data loading with context took: 38.6209%, (4.388304 secs)
[11/03/2019 11:47:42 INFO 140169171593024] compute all data-center distances: point norm took: 19.0106%, (2.160087 secs)
[11/03/2019 11:47:42 INFO 140169171593024] gradient: cluster center took: 13.1121%, (1.489863 secs)
[11/03/2019 11:47:42 INFO 140169171593024] compute all data-center distances: inner product took: 9.4443%, (1.073109 secs)
[11/03/2019 11:47:42 INFO 140169171593024] collect from kv store took: 5.5164%, (0.626799 secs)
[11/03/2019 11:47:42 INFO 140169171593024] predict compute msd took: 4.7494%, (0.539646 secs)
[11/03/2019 11:47:42 INFO 140169171593024] gradient: cluster size  took: 3.1338%, (0.356081 secs)
[11/03/2019 11:47:42 INFO 140169171593024] splitting centers key-value pair took: 1.9277%, (0.219037 secs)
[11/03/2019 11:47:42 INFO 140169171593024] compute all data-center distances: center norm took: 1.5278%, (0.173592 secs)
[11/03/2019 11:47:42 INFO 140169171593024] gradient: one_hot took: 1.4084%, (0.160024 secs)
[11/03/2019 11:47:42 INFO 140169171593024] update state and report convergance took: 1.3147%, (0.149378 secs)
[11/03/2019 11:47:42 INFO 140169171593024] update set-up time took: 0.1200%, (0.013640 secs)
[11/03/2019 11:47:42 INFO 140169171593024] predict minus dist took: 0.1141%, (0.012959 secs)
[11/03/2019 11:47:42 INFO 140169171593024] TOTAL took: 11.3625204563
[11/03/2019 11:47:42 INFO 140169171593024] Number of GPUs being used: 0
#metrics {"Metrics": {"finalize.time": {"count": 1, "max": 387.3600959777832, "sum": 387.3600959777832, "min": 387.3600959777832}, "initialize.time": {"count": 1, "max": 42.871952056884766, "sum": 42.871952056884766, "min": 42.871952056884766}, "model.serialize.time": {"count": 1, "max": 0.2219676971435547, "sum": 0.2219676971435547, "min": 0.2219676971435547}, "update.time": {"count": 100, "max": 197.33190536499023, "sum": 11322.939395904541, "min": 97.9759693145752}, "epochs": {"count": 1, "max": 100, "sum": 100.0, "min": 100}, "state.serialize.time": {"count": 1, "max": 0.5171298980712891, "sum": 0.5171298980712891, "min": 0.5171298980712891}, "_shrink.time": {"count": 1, "max": 384.3569755554199, "sum": 384.3569755554199, "min": 384.3569755554199}}, "EndTime": 1572781662.199495, "Dimensions": {"Host": "algo-1", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.32371}

[11/03/2019 11:47:42 INFO 140169171593024] Test data is not provided.
#metrics {"Metrics": {"totaltime": {"count": 1, "max": 13017.530918121338, "sum": 13017.530918121338, "min": 13017.530918121338}, "setuptime": {"count": 1, "max": 30.853986740112305, "sum": 30.853986740112305, "min": 30.853986740112305}}, "EndTime": 1572781662.202104, "Dimensions": {"Host": "algo-1", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781662.199603}

[11/03/2019 11:47:41 INFO 140552810366784] shrinking 100 centers into 10
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #0. Current mean square distance 12.250052
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #1. Current mean square distance 12.186016
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #2. Current mean square distance 12.200719
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #3. Current mean square distance 11.887745
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #4. Current mean square distance 12.341534
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #5. Current mean square distance 12.504448
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #6. Current mean square distance 12.133743
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #7. Current mean square distance 12.772625
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #8. Current mean square distance 12.143409
[11/03/2019 11:47:42 INFO 140552810366784] local kmeans attempt #9. Current mean square distance 12.344214
[11/03/2019 11:47:42 INFO 140552810366784] finished shrinking process. Mean Square Distance = 12
[11/03/2019 11:47:42 INFO 140552810366784] #quality_metric: host=algo-2, train msd <loss>=11.8877449036
[11/03/2019 11:47:42 INFO 140552810366784] batch data loading with context took: 31.9681%, (3.320623 secs)
[11/03/2019 11:47:42 INFO 140552810366784] compute all data-center distances: point norm took: 20.7105%, (2.151268 secs)
[11/03/2019 11:47:42 INFO 140552810366784] collect from kv store took: 13.6408%, (1.416910 secs)
[11/03/2019 11:47:42 INFO 140552810366784] gradient: cluster center took: 11.5084%, (1.195417 secs)
[11/03/2019 11:47:42 INFO 140552810366784] compute all data-center distances: inner product took: 9.2459%, (0.960398 secs)
[11/03/2019 11:47:42 INFO 140552810366784] predict compute msd took: 4.4798%, (0.465329 secs)
[11/03/2019 11:47:42 INFO 140552810366784] gradient: cluster size  took: 3.0899%, (0.320962 secs)
[11/03/2019 11:47:42 INFO 140552810366784] gradient: one_hot took: 1.5796%, (0.164074 secs)
[11/03/2019 11:47:42 INFO 140552810366784] update state and report convergance took: 1.2818%, (0.133143 secs)
[11/03/2019 11:47:42 INFO 140552810366784] splitting centers key-value pair took: 1.1349%, (0.117886 secs)
[11/03/2019 11:47:42 INFO 140552810366784] compute all data-center distances: center norm took: 1.1272%, (0.117085 secs)
[11/03/2019 11:47:42 INFO 140552810366784] predict minus dist took: 0.1201%, (0.012476 secs)
[11/03/2019 11:47:42 INFO 140552810366784] update set-up time took: 0.1130%, (0.011741 secs)
[11/03/2019 11:47:42 INFO 140552810366784] TOTAL took: 10.3873124123
[11/03/2019 11:47:42 INFO 140552810366784] Number of GPUs being used: 0
[11/03/2019 11:47:42 INFO 140552810366784] No model is serialized on a non-master node
#metrics {"Metrics": {"finalize.time": {"count": 1, "max": 291.3999557495117, "sum": 291.3999557495117, "min": 291.3999557495117}, "initialize.time": {"count": 1, "max": 41.98312759399414, "sum": 41.98312759399414, "min": 41.98312759399414}, "model.serialize.time": {"count": 1, "max": 0.07700920104980469, "sum": 0.07700920104980469, "min": 0.07700920104980469}, "update.time": {"count": 100, "max": 179.54707145690918, "sum": 10432.80816078186, "min": 89.97201919555664}, "epochs": {"count": 1, "max": 100, "sum": 100.0, "min": 100}, "state.serialize.time": {"count": 1, "max": 0.4820823669433594, "sum": 0.4820823669433594, "min": 0.4820823669433594}, "_shrink.time": {"count": 1, "max": 288.4190082550049, "sum": 288.4190082550049, "min": 288.4190082550049}}, "EndTime": 1572781662.107717, "Dimensions": {"Host": "algo-2", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781650.328628}

[11/03/2019 11:47:42 INFO 140552810366784] Test data is not provided.
#metrics {"Metrics": {"totaltime": {"count": 1, "max": 13907.652139663696, "sum": 13907.652139663696, "min": 13907.652139663696}, "setuptime": {"count": 1, "max": 16.698122024536133, "sum": 16.698122024536133, "min": 16.698122024536133}}, "EndTime": 1572781662.109637, "Dimensions": {"Host": "algo-2", "Operation": "training", "Algorithm": "AWS/KMeansWebscale"}, "StartTime": 1572781662.107824}


2019-11-03 11:47:54 Uploading - Uploading generated training model
2019-11-03 11:47:54 Completed - Training job completed
Training seconds: 142
Billable seconds: 142
CPU times: user 7.93 s, sys: 394 ms, total: 8.33 s
Wall time: 3min 21s
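The log above shows the final "shrinking" step on each host: 100 intermediate centers are reduced to 10 over ten local k-means attempts, and the reported `train msd` matches the lowest attempt. As a rough illustration (not part of the original notebook), the attempts can be pulled out of the log text with a regular expression. The `log_text` sample copies three of the algo-2 lines, and the helper name `best_attempt` is my own.

```python
import re

# Three of the algo-2 "local kmeans attempt" lines, copied from the log above.
log_text = """\
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #2. Current mean square distance 12.200719
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #3. Current mean square distance 11.887745
[11/03/2019 11:47:41 INFO 140552810366784] local kmeans attempt #4. Current mean square distance 12.341534
"""

# Captures the attempt number and its mean square distance.
ATTEMPT_RE = re.compile(
    r"local kmeans attempt #(\d+)\. Current mean square distance ([\d.]+)"
)

def best_attempt(text):
    """Return (attempt_number, msd) for the attempt with the lowest MSD."""
    attempts = [(int(n), float(msd)) for n, msd in ATTEMPT_RE.findall(text)]
    return min(attempts, key=lambda pair: pair[1])

print(best_attempt(log_text))
```

On this sample the helper picks attempt #3, the same value that algo-2 reports in its `#quality_metric` line.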
In [9]:
%%time

kmeans_predictor = kmeans.deploy(initial_instance_count=1,
                                 instance_type='ml.m4.xlarge')
--------------------------------------------------------------------------------------------------!
CPU times: user 482 ms, sys: 38.7 ms, total: 521 ms
Wall time: 8min 14s
In [10]:
%%time 

result = kmeans_predictor.predict(valid_set[0][0:100])
clusters = [r.label['closest_cluster'].float32_tensor.values[0] for r in result]
CPU times: user 34.1 ms, sys: 353 µs, total: 34.5 ms
Wall time: 334 ms
In [11]:
for cluster in range(10):
    print('\n\n\nCluster {}:'.format(int(cluster)))
    # Collect the images (from the 100 sent for prediction) assigned to this cluster.
    digits = [ img for l, img in zip(clusters, valid_set[0]) if int(l) == cluster ]
    # Lay the images out on a grid 5 wide; height is the ceiling of len(digits) / 5.
    height = ((len(digits)-1)//5) + 1
    width = 5
    plt.rcParams["figure.figsize"] = (width,height)
    _, subplots = plt.subplots(height, width)
    subplots = subplots.flatten()
    for subplot, image in zip(subplots, digits):
        show_digit(image, subplot=subplot)
    # Hide the unused cells in the last row.
    for subplot in subplots[len(digits):]:
        subplot.axis('off')

    plt.show()


[Output: one grid of digit images for each of Cluster 0 through Cluster 9]
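The expression `((len(digits)-1)//5) + 1` in the plotting loop is an integer ceiling division: it gives the number of 5-wide rows needed to hold every digit in a cluster. A small sketch of that arithmetic, using the hypothetical helper name `rows_needed`:

```python
def rows_needed(n_items, width=5):
    """Rows of `width` cells needed to hold n_items (ceiling division)."""
    return (n_items - 1) // width + 1

# A few sample counts to show where a new row starts.
for n in (1, 5, 6, 12):
    print(n, rows_needed(n))
```

Note that an empty cluster gives 0 rows, which `plt.subplots` would most likely reject, so the loop implicitly assumes every cluster received at least one of the 100 digits.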
In [12]:
result = kmeans_predictor.predict(valid_set[0][230:231])
print(result)
[label {
  key: "closest_cluster"
  value {
    float32_tensor {
      values: 4.0
    }
  }
}
label {
  key: "distance_to_cluster"
  value {
    float32_tensor {
      values: 6.309240818023682
    }
  }
}
]
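The response pairs the input with the index of its nearest cluster center and the distance to that center. A minimal sketch of that idea in plain Python, assuming Euclidean distance and a made-up set of 2-D centroids (the real model has 10 centers over 784 pixel values, which are not shown in this post). The function name `closest_cluster` is borrowed from the response key for readability, not from any SageMaker API.

```python
import math

def closest_cluster(point, centroids):
    """Return (index, distance) of the centroid nearest to `point` (Euclidean)."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(distances)), key=distances.__getitem__)
    return idx, distances[idx]

# Hypothetical toy centroids, for illustration only.
centroids = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
idx, dist = closest_cluster((3.0, 3.0), centroids)
print(idx, dist)
```

Here the point `(3.0, 3.0)` lands in cluster 1, at distance 1.0 from its center, which mirrors the `closest_cluster` and `distance_to_cluster` labels in the output above.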
In [13]:
show_digit(valid_set[0][230], 'This is a {}'.format(valid_set[1][230]))