PHM Data Challenge 2024: PDF and JSON

Using a PDF (probability density function)

Let's look at a normal distribution, and assume you have made a deterministic prediction of 10

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

Set the y prediction (torque margin) to 10

In [2]:
my_pred = 10

We can create a normal distribution centered on 10 (scale defaults to 1, i.e. a standard deviation of 1)

In [3]:
x = np.arange(0,20,0.01)
y = norm.pdf(x, loc=my_pred)

plt.plot(x,y)
plt.show()

Let's look at the score when the true answer is 10. Here the score is the value of the PDF at the true answer.

In [4]:
true_answer = 10
score = norm.pdf(true_answer, loc=my_pred)
print('score: {}'.format(score))

plt.plot(x,y, label='PDF')
plt.scatter(true_answer, score, label="Prediction and score", color='red')
plt.legend()
plt.show()
score: 0.3989422804014327

What if the true answer is 9?

In [5]:
true_answer = 9
score = norm.pdf(true_answer, loc=my_pred)
print('score: {}'.format(score))

plt.plot(x,y, label='PDF')
plt.scatter(true_answer, score, label="Prediction and score", color='red')
plt.legend()
plt.show()
score: 0.24197072451914337
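
The two scores above can be checked by hand against the standard normal density formula. In the sketch below, the symbols \mu and \sigma are notation introduced here; they correspond to the `loc` and `scale` arguments of `norm.pdf`, and `scale` defaults to 1.

```latex
f(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}
                  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

f(10;\,10,\,1) = \frac{1}{\sqrt{2\pi}} \approx 0.3989
\qquad
f(9;\,10,\,1) = \frac{1}{\sqrt{2\pi}}\, e^{-1/2} \approx 0.2420
```

Note that the peak height is 1/(\sigma\sqrt{2\pi}), so a smaller \sigma gives a taller, narrower curve.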

Let's shrink the scale (standard deviation) to 0.4 to increase our score when the prediction is exactly right

In [6]:
true_answer = 10
scale = .4
x = np.arange(0,20,0.01)
y = norm.pdf(x, loc=my_pred, scale=scale)
score = norm.pdf(true_answer, loc=my_pred, scale=scale)
print('score: {}'.format(score))

plt.plot(x,y, label='PDF')
plt.scatter(true_answer, score, label="Prediction and score", color='red')
plt.legend()
plt.show()
score: 0.9973557010035817

With the narrower distribution, what if the true answer is 9?

In [7]:
true_answer = 9
scale = .4
x = np.arange(0,20,0.01)
y = norm.pdf(x, loc=my_pred, scale=scale)
score = norm.pdf(true_answer, loc=my_pred, scale=scale)
print('score: {}'.format(score))

plt.plot(x,y, label='PDF')
plt.scatter(true_answer, score, label="Prediction and score", color='red')
plt.legend()
plt.show()
score: 0.043820751233921346
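
The cells above suggest a trade-off: a small scale pays off when the prediction is very close, and is punished heavily when it is not. The sketch below sweeps a few scale values for both true answers. To stay dependency-free it uses a hand-written `normal_pdf` helper (a name introduced here, intended to give the same values as `scipy.stats.norm.pdf`), and the list of scales is arbitrary.

```python
import math

def normal_pdf(x, loc=0.0, scale=1.0):
    """Normal density; intended to match scipy.stats.norm.pdf(x, loc, scale)."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))

my_pred = 10
scales = [0.2, 0.4, 1.0, 2.0, 4.0]  # arbitrary values chosen for illustration

# Score every scale against an exact hit (10) and a miss by 1 (9)
for true_answer in (10, 9):
    print('true answer: {}'.format(true_answer))
    for scale in scales:
        score = normal_pdf(true_answer, loc=my_pred, scale=scale)
        print('  scale={:<4} score={:.4f}'.format(scale, score))

# Among the candidate scales, find the one that scores best for a miss by 1
best_scale = max(scales, key=lambda s: normal_pdf(9, loc=my_pred, scale=s))
print('best scale for true answer 9: {}'.format(best_scale))
```

For a normal PDF and a fixed error, the score is highest when the scale equals the absolute error, so the scale is best thought of as an estimate of how far off the prediction is likely to be.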

How to create a simple JSON submission file

Example classification and regression predictions

In [8]:
sample_ids = [0, 1, 2, 3, 4]
class_preds = [0, 1, 1, 0, 0]
class_confs = [0.5, 0.9, 0.75, 0.1, 0.99]
reg_preds = [3.3, -15.3, 6.7, 0.22, -8.0005]

Iterate through each prediction and build a Python dictionary

In [9]:
submission_dic = {}

for sample_id, class_pred, class_conf, reg_pred in zip(
        sample_ids, class_preds, class_confs, reg_preds):
    # One entry per sample: class label, class confidence, and a normal PDF
    # centered on the regression prediction
    submission_dic[int(sample_id)] = {
        "class": int(class_pred),
        "class_conf": class_conf,
        "pdf_type": "norm",
        "pdf_args": {"loc": reg_pred, "scale": 1},
    }

submission_dic
Out[9]:
{0: {'class': 0,
  'class_conf': 0.5,
  'pdf_type': 'norm',
  'pdf_args': {'loc': 3.3, 'scale': 1}},
 1: {'class': 1,
  'class_conf': 0.9,
  'pdf_type': 'norm',
  'pdf_args': {'loc': -15.3, 'scale': 1}},
 2: {'class': 1,
  'class_conf': 0.75,
  'pdf_type': 'norm',
  'pdf_args': {'loc': 6.7, 'scale': 1}},
 3: {'class': 0,
  'class_conf': 0.1,
  'pdf_type': 'norm',
  'pdf_args': {'loc': 0.22, 'scale': 1}},
 4: {'class': 0,
  'class_conf': 0.99,
  'pdf_type': 'norm',
  'pdf_args': {'loc': -8.0005, 'scale': 1}}}

Save it as a JSON file

In [10]:
import json

with open('submission.json', 'w') as json_file:
    json.dump(submission_dic, json_file, indent=4)
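
One detail worth knowing when reading the file back: JSON object keys are always strings, so `json.dump` writes the integer sample ids as `"0"`, `"1"`, and so on. The round trip below is a minimal sketch using a two-entry dictionary and a temporary directory (both chosen here for illustration) so that it does not overwrite a real submission.

```python
import json
import os
import tempfile

# Two-entry stand-in for the full submission dictionary
submission_dic = {
    0: {"class": 0, "class_conf": 0.5,
        "pdf_type": "norm", "pdf_args": {"loc": 3.3, "scale": 1}},
    1: {"class": 1, "class_conf": 0.9,
        "pdf_type": "norm", "pdf_args": {"loc": -15.3, "scale": 1}},
}

# Write to a temporary location, then load the file back
path = os.path.join(tempfile.mkdtemp(), 'submission.json')
with open(path, 'w') as json_file:
    json.dump(submission_dic, json_file, indent=4)

with open(path) as json_file:
    loaded = json.load(json_file)

# The integer ids come back as strings
print(sorted(loaded.keys()))
print(loaded['0']['pdf_args'])
```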