3: Answering questions

Objective:

In this lab, you will build a machine learning job and use its results to raise, and then answer, further questions about the data.

  1. Create a population job on the elasticlogs index satisfying the following:

    • the job is on the whole elasticlogs dataset
    • the population is grouped by the request field
    • the metric is Count
    • the bucket span is 1h.
    Show answer

    From the main menu go to Machine Learning.
    Under Anomaly Detection, select Jobs and Create job. Use the elasticlogs data view and select the Population job.

    "Population job"

    Click Use full elasticlogs data. Click Next.

    "Use full elasticlogs data"

    For Population field, select request.
    For the metric, select Count(Event rate), and set the Bucket span to 1h. Click Next.

    "Population job settings"

    Set the Job ID to something descriptive, such as hourly-population-request. Click Next.

    Make sure your job passes the validation checks. Click Next, then click Create job. The job shouldn't take long to finish.

    "Start job"

    Click View results. This will take you to the Anomaly Explorer.
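
    For reference, the wizard settings above can also be written as request bodies for the Elasticsearch anomaly detection APIs. The following is a minimal sketch, not the exact configuration the wizard generates. It assumes the time field is named @timestamp and that the datafeed reads every document in elasticlogs; the helper function names are hypothetical.

```python
import json


def build_population_job(bucket_span: str = "1h", time_field: str = "@timestamp") -> dict:
    """Body for PUT _ml/anomaly_detectors/<job_id>.

    The time field name is an assumption; check the elasticlogs mapping.
    """
    return {
        "description": "Hourly event count per request, compared across the population",
        "analysis_config": {
            "bucket_span": bucket_span,
            # A population analysis uses over_field_name: each request is
            # compared against the behavior of all other requests.
            "detectors": [{"function": "count", "over_field_name": "request"}],
            "influencers": ["request"],
        },
        "data_description": {"time_field": time_field},
    }


def build_datafeed(job_id: str, index: str = "elasticlogs") -> dict:
    """Body for PUT _ml/datafeeds/datafeed-<job_id>; match_all covers the full dataset."""
    return {
        "job_id": job_id,
        "indices": [index],
        "query": {"match_all": {}},
    }


job_id = "hourly-population-request"
job_body = build_population_job()
datafeed_body = build_datafeed(job_id)

print(f"PUT _ml/anomaly_detectors/{job_id}")
print(json.dumps(job_body, indent=2))
print(f"PUT _ml/datafeeds/datafeed-{job_id}")
print(json.dumps(datafeed_body, indent=2))
```

    The printed bodies can be pasted into Dev Tools if you prefer to create the job without the wizard.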

  2. Take a look at the results of the population job.

    "View results"

    The highest severity anomalies are found in /blog/welcome-insight-io-to-the-elastic-team. No surprise there! The anomalies were so severe that we had already investigated them. Leave an annotation for the webteam and cyberteam describing what we found.

    Show answer

    Switch to the Single Metric Viewer to take a look at the results for the request /blog/welcome-insight-io-to-the-elastic-team.

    "Single Metric Viewer"

    Expand the time slider to view the whole time range. Then click and drag across the whole date histogram to create an annotation. Leave a message for the webteam and cyberteam about our findings.

    "Add annotation"

  3. Go back to the Anomaly Explorer. Take a look at the request /blog/kibana-10-common-questions-formulas-time-series-maps. What's happening here? What did the population job find that was not obvious to us before?

    Show answer

    Click and drag over the boxes in the row for the request /blog/kibana-10-common-questions-formulas-time-series-maps.

    "Select row"

    Scroll down and you'll see a graph along with a list of the anomalies found for this request.

    "See graph"

    Click View in the top right corner of the graph to see it close up in the Single Metric Viewer.

    "View it close up"

    It looks like there were no visitors to this blog page. Then suddenly on August 18, 2021, there was a huge spike, followed by another spike on August 19. Then the visits taper off.

    Why do we see this behavior?
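
    To check the spike outside the ML views, you could count the visits to this request per hour with a date histogram aggregation. The following is a minimal sketch of a body for GET elasticlogs/_search. It assumes the request field is mapped as a keyword (otherwise try request.keyword) and that the time field is @timestamp; the function name is hypothetical.

```python
import json

REQUEST = "/blog/kibana-10-common-questions-formulas-time-series-maps"


def build_hourly_count_query(
    request_path: str,
    request_field: str = "request",
    time_field: str = "@timestamp",
) -> dict:
    """Search body that counts documents per hour for a single request path."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"term": {request_field: request_path}},
        "aggs": {
            "visits_per_hour": {
                # 1h matches the bucket span used by the population job
                "date_histogram": {"field": time_field, "fixed_interval": "1h"}
            }
        },
    }


hourly_query = build_hourly_count_query(REQUEST)
print("GET elasticlogs/_search")
print(json.dumps(hourly_query, indent=2))
```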

  4. Keep digging around. Use different tools. Create more visualizations in your dashboards. Use Discover. Try to figure out what's causing this strange behavior. Resist the urge to click Show answer until after you've tried to find the solution yourself.

    Show answer

    The dataset is rigged! Every document in this dataset is a visit to a blog page that was published in 2018, with one exception: the blog entitled 10 common questions answered with formulas and time travel in Kibana, which was published on August 18, 2021. Its request is /blog/kibana-10-common-questions-formulas-time-series-maps. That is why the page has no visits before that date and a spike right after publication.

    To see this, go to Discover. Click New to start a new search. Select the fields request, blog_category, and blog_publish_date. You can also save this search and add it to your Elastic logs dashboard.

    "Blog table"
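
    The same table can be approximated with a search request. The following is a minimal sketch of a body for GET elasticlogs/_search that returns only the three fields and sorts by publish date, newest first, so the 2021 blog should appear at the top. It assumes blog_publish_date is mapped as a date; the function name is hypothetical.

```python
import json


def build_blog_table_query(size: int = 10) -> dict:
    """Search body listing request, category, and publish date, newest blogs first."""
    return {
        "size": size,
        "_source": ["request", "blog_category", "blog_publish_date"],
        # Descending sort surfaces the most recently published blog
        "sort": [{"blog_publish_date": {"order": "desc"}}],
    }


blog_table_query = build_blog_table_query()
print("GET elasticlogs/_search")
print(json.dumps(blog_table_query, indent=2))
```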

Summary:

In this lab, you created a population job and explored its results. Now you're ready to build your very own custom dashboard for your data!