Streaming Pac-Man

Streaming Pac-Man is the funniest application you will ever see while applying the principles of streaming analytics with Apache Kafka. Built around the famous Pac-Man game, this application ingests and stores events from the game in Kafka topics and lets you process them in near real time using ksqlDB. To keep you focused, the application is based on fully managed services in Confluent Cloud for both the Kafka cluster and ksqlDB.

pacman game

To implement streaming analytics in the game we built a scoreboard using ksqlDB. The scoreboard is a table containing aggregated metrics for the players, such as their highest score, the highest level achieved, and the number of times the player has lost. As new events arrive, the scoreboard is instantly updated by continuous queries that keep processing those events as they happen.
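
The real queries ship with the demo and are deployed for you, but as an illustration, a continuously updated scoreboard table in ksqlDB could look roughly like the sketch below. The source stream USER_GAME and its columns are assumptions made for this example and may differ from the deployed statements.

    -- Illustrative sketch only, not the deployed statements:
    -- aggregate per-player metrics from an assumed USER_GAME stream
    -- (the losses count would come from a second stream of loss events).
    CREATE TABLE STATS_PER_USER AS
      SELECT USER,
             MAX(GAME_SCORE) AS HIGHEST_SCORE,
             MAX(GAME_LEVEL) AS HIGHEST_LEVEL
      FROM USER_GAME
      GROUP BY USER;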

What are you going to need?

  • Java and Maven - The UI layer of the application relies on two APIs that are implemented in Java, so you will need Java 11+ installed to build the source code. The build itself is implemented with Maven and is triggered automatically by Terraform.

  • Confluent Cloud CLI - During the demo setup, a bash script sets up the whole Confluent Cloud environment for you. To do that, it needs the Confluent Cloud CLI installed locally. You can find instructions about how to install it here. v1.7.0 or later is required, and you must be logged in with the --save argument, which saves your Confluent Cloud user login credentials or refresh token (in the case of SSO) to the local netrc file.

  • Confluent Cloud account - If you do not have a Confluent Cloud account, [create one now](https://www.confluent.io/confluent-cloud/tryfree/).

  • Terraform (0.14+) - The application is created automatically using Terraform. The default cloud provider is AWS, but you could also port this logic to GCP or Azure. Besides having Terraform installed locally, you will need to provide your cloud provider credentials so Terraform can create and manage the resources for you.

Create a Cloud API Key

  1. Open the [Confluent Cloud Console](https://confluent.cloud/settings/api-keys/create), click the Granular access tab, and then click Next.

  2. Click Create a new one to create a new service account. Enter the new service account name (tf_runner), then click Next.

  3. The Cloud API key and secret are generated for the tf_runner service account. Save your Cloud API key and secret in a secure location. You will need this API key and secret to use the Confluent Terraform Provider.

  4. [Assign](https://confluent.cloud/settings/org/assignments) the OrganizationAdmin role to the tf_runner service account by following [this guide](https://docs.confluent.io/cloud/current/access-management/access-control/cloud-rbac.html#add-a-role-binding-for-a-user-or-service-account).

![Assigning the OrganizationAdmin role to tf_runner service account](https://github.com/confluentinc/terraform-provider-confluent/raw/master/docs/images/OrganizationAdmin.png)

1) Configure the demo

The whole demo creation is scripted. The script will leverage Terraform to:

  1. Provision the Confluent Cloud resources using the Confluent Cloud CLI. As mentioned before, the application uses a Kafka cluster running in a fully managed service for Apache Kafka, so this is the first thing it creates. If you are interested in how to create a cluster in Confluent Cloud via the Web UI, have a look at our Quick Start for Apache Kafka using Confluent Cloud.

  2. Spin up the other resources needed in AWS.

Configure the Confluent CLI

  1. Log in to the Confluent Cloud CLI:

    confluent login --save

The --save flag will save your Confluent Cloud login credentials to the ~/.netrc file.

Configure the demo

The demo script will do everything for you, but for it to work you need to provide some configuration.

  1. Create the demo.cfg file using the example provided in the config folder

    cp config/demo.cfg.example config/demo.cfg
  2. Provide the required information in the demo.cfg file

    export TF_VAR_aws_profile="<AWS_PROFILE>"
    export TF_VAR_aws_region="eu-west-2"
    export TF_VAR_schema_registry_region="eu-central-1"
    export TF_VAR_confluent_cloud_api_key="<CONFLUENT_CLOUD_API_KEY>"
    export TF_VAR_confluent_cloud_api_secret="<CONFLUENT_CLOUD_API_SECRET>"

    We advise using the utility gimme-aws-creds if you use Okta to log in to AWS. You can also use the [granted](https://granted.dev/) CLI for AWS credentials. Amend any of the config as you see fit (like the AWS region or Schema Registry region).

  3. If you are not using gimme-aws-creds, create a credentials file as described here. The file in ~/.aws/credentials should look like the example below:

    [default]
    aws_access_key_id=AKIAIOSFODNN7EXAMPLE
    aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

    In that case you can set TF_VAR_aws_profile="default" in the demo.cfg file.

  4. Note the optional configuration in the same file. Uncomment and define it as you see fit.

    # These are optional configs
    # export S3_BUCKET_NAME="ksqldbpacman"

2) Deploying the application

The application is essentially a set of HTML/CSS/JS files that form a microsite that can be hosted statically anywhere. But for the sake of coolness we will deploy this microsite in an S3 bucket on AWS. This bucket will be created in the same region selected for the Confluent Cloud cluster to ensure that the application is co-located. The application emits events that are processed by an event handler implemented as an API Gateway that uses a Lambda function as its backend. This event handler API receives the events and writes them into Kafka using ksqlDB.

pac man arch
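
To make the write path concrete, the statement below is the kind of thing the event handler could issue against ksqlDB to append a single game event to the backing Kafka topic. The stream and column names are illustrative assumptions, not the exact call made by the Lambda function.

    -- Illustrative only: append one game event via ksqlDB
    -- (stream and column names are assumed for this example).
    INSERT INTO USER_GAME (USER, GAME_SCORE, GAME_LEVEL, GAME_LIVES)
      VALUES ('player-1', 1480, 2, 3);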

Please note that during deployment, the script takes care of creating the required Kafka topics and also the ksqlDB queries. Therefore, there is no need to manually create them.

  1. Start the demo creation

    ./start.sh
  2. At the end of the provisioning, the output with the demo endpoint will be shown. Paste the demo URL into your browser and start playing!

    Outputs:
    
    Pacman = http://streaming-pacman00000.s3-website-region.amazonaws.com

If you notice a CORS error in the browser and the scoreboard is not updating, try relaunching the demo.

Note: When you are done with the application, you can automatically destroy all the resources created using the command below:

./stop.sh

3) The scoreboard

The scoreboard can be visualized in real time by clicking on the SCOREBOARD link in the pacman game (top right corner). Who will be the best player?

scoreboard

4) Looking under the hood

When users play the Pac-Man game, two types of events are generated. One is called User Game and contains data about the user’s current game, such as their score, current level, and number of lives. The other is called User Losses and, as the name implies, contains data about when the user loses a game. To build a scoreboard out of this, a streaming analytics pipeline transforms these raw events into a scoreboard table that is updated in near real time.
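
For illustration, the two raw event streams might be declared in ksqlDB along the lines below. The topic, stream, and column names are assumptions; the real definitions live in statements.sql and are deployed automatically.

    -- Hypothetical declarations of the two raw event streams (names assumed).
    CREATE STREAM USER_GAME (USER VARCHAR KEY, GAME_SCORE INT, GAME_LEVEL INT, GAME_LIVES INT)
      WITH (KAFKA_TOPIC='USER_GAME', VALUE_FORMAT='JSON');

    CREATE STREAM USER_LOSSES (USER VARCHAR KEY, GAME_LEVEL INT)
      WITH (KAFKA_TOPIC='USER_LOSSES', VALUE_FORMAT='JSON');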

pipeline

To implement the pipeline we use ksqlDB. The code for this pipeline has been written for you and it was automatically deployed into a fully managed ksqlDB Server.

The scoreboard logic

ksqlDB supports pull queries, which let you get the latest value for a given key. The Pac-Man app uses this feature to show you the scoreboard, with a simple trick:

  1. A first request is sent to get the set of all the players' user_id values. This collection of strings is calculated continuously, in real time, by ksqlDB using the COLLECT_SET aggregate function, as you can see in statements.sql. By using a constant as the aggregation key we effectively create a single aggregation over all the events in the stream, and we can then use this constant string as the key in our pull query (see the sketch after this list):

    SELECT HIGHEST_SCORE_VALUE, USERS_SET_VALUE FROM SUMMARY_STATS WHERE SUMMARY_KEY='SUMMARY_KEY';
  2. A query to the scoreboard is then sent, using the list retrieved by the first API call in the WHERE ... IN clause:

    SELECT USER, HIGHEST_SCORE, HIGHEST_LEVEL, TOTAL_LOSSES FROM STATS_PER_USER WHERE USER IN (${userListCsv});
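
For reference, the constant-key aggregation behind the first request could be sketched as below, assuming the same USER_GAME stream as in the earlier examples; the deployed version lives in statements.sql and may differ. Grouping by a constant collapses all events into a single row, which is what lets the pull query look it up with SUMMARY_KEY='SUMMARY_KEY'.

    -- Illustrative sketch of the constant-key aggregation; column names match
    -- the pull query above, while the source stream and expressions are assumed.
    CREATE TABLE SUMMARY_STATS AS
      SELECT 'SUMMARY_KEY' AS SUMMARY_KEY,
             MAX(GAME_SCORE) AS HIGHEST_SCORE_VALUE,
             COLLECT_SET(USER) AS USERS_SET_VALUE
      FROM USER_GAME
      GROUP BY 'SUMMARY_KEY';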

Troubleshooting

If you face issues with the Pac-Man app, try opening your browser's developer tools and checking which errors appear in the console. If you see a CORS-related issue, check your user in AWS IAM; we have seen cases where missing permissions resulted in this issue. The solution is to add your user to the relevant groups.

License

This project is licensed under the Apache 2.0 License.

Previous Pacman Demo

Are you looking for the previous version of this demo? You can find it here: https://github.com/confluentinc/demo-scene/releases/tag/pacman-v1.0