Confusions about the timestamp #620

Open
eiphy opened this issue Mar 10, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@eiphy

eiphy commented Mar 10, 2024

Describe the bug
The timestamps returned from Polygon do not make sense to me. After converting them to regular datetime objects, they span from 00:00 to 08:00 and then from 17:00 to 23:00. However, the core trading hours of NASDAQ should be 14:30 to 21:00 UTC. The volume values are also quite large outside those trading hours.

To Reproduce
Use this Python script to reproduce:

import datetime

import pandas as pd
from polygon import RESTClient

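client = RESTClient(api_key="YOUR_API_KEY")  # replace with your own key
aggs = client.get_aggs("AAPL", 1, "hour", from_="2023-05-01", to="2023-06-01", limit=50000)
data = []
for agg in aggs:
    data.append(
        {
            # agg.timestamp is in milliseconds; fromtimestamp() without tz
            # returns a naive datetime in the machine's local timezone
            "datetime": datetime.datetime.fromtimestamp(agg.timestamp / 1000),
            "close": agg.close,
            "volume": agg.volume,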
        }
    )
df = pd.DataFrame(data)
df = df.set_index("datetime")
df = df.sort_index()
print(df.iloc[0:60])

Expected behavior
Data should fall within core trading hours, i.e., 14:30 to 21:00 UTC.

Screenshots
An example:
image

Additional context
N.A.

@eiphy eiphy added the bug Something isn't working label Mar 10, 2024
@eiphy eiphy closed this as completed Mar 13, 2024
@justinpolygon
Contributor

Hey @eiphy, I was checking this out and just wanted to make sure you resolved it on your end. Was it a timezone issue?

@eiphy
Author

eiphy commented Mar 13, 2024

Hi @justinpolygon, yes, it's a timezone issue.
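For anyone landing here with the same confusion, here is a minimal sketch of the pitfall, using only the standard library. `datetime.fromtimestamp()` called without a `tz` argument returns a naive datetime in the machine's local timezone, so the hours shift depending on where the script runs. The helper name `ms_to_utc` and the sample timestamp are illustrative assumptions, not part of the Polygon client.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def ms_to_utc(ts_ms: int) -> datetime:
    """Convert Unix epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)


# Hypothetical sample value: 1682947800000 ms is 2023-05-01 13:30:00 UTC
ts_ms = 1682947800000

utc_dt = ms_to_utc(ts_ms)
# Convert to exchange-local time for readability
ny_dt = utc_dt.astimezone(ZoneInfo("America/New_York"))

print(utc_dt.isoformat())
print(ny_dt.isoformat())
```

Passing `tz=` explicitly makes the result independent of the machine's timezone settings, which is usually what you want when comparing against exchange hours.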

@ywave620

                     Open  High   Low  Close  Volume    Vwap  Transactions   Otc
Timestamp
2022-06-15 13:30:00  6.15  6.86  6.15   6.23      50  6.5202            11  None
2022-06-15 13:31:00  6.11  6.27  6.01   6.27      14  6.1207             5  None
2022-06-15 13:32:00  6.26  6.50  6.25   6.50     423  6.4920            23  None
2022-06-15 13:33:00  6.55  6.65  6.39   6.39     238  6.6324            12  None
2022-06-15 13:34:00  6.26  6.37  6.26   6.37      61  6.3154             6  None
...                   ...   ...   ...    ...     ...     ...           ...   ...
2022-06-15 19:50:00  8.86  8.86  7.64   7.64      11  8.1418             3  None
2022-06-15 19:53:00  7.48  7.51  7.48   7.51       3  7.4900             2  None
2022-06-15 19:55:00  7.99  7.99  7.99   7.99       1  7.9900             1  None
2022-06-15 19:57:00  7.79  7.79  7.79   7.79      15  7.7900             1  None
2022-06-15 19:58:00  7.78  7.78  7.76   7.76      20  7.7610             2  None

It looks like there is still a problem. I'm looking at options data, and I think there should be no data before 14:30 UTC.
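One thing worth checking here is that the UTC equivalent of US market hours depends on daylight saving time. The sketch below assumes a regular session of 09:30 to 16:00 New York time (an assumption; some products trade on different hours) and computes the matching UTC window for a given date. The function name `regular_session_utc` is made up for illustration.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")


def regular_session_utc(day: date) -> tuple[datetime, datetime]:
    """Return (open, close) of an assumed 09:30-16:00 New York session, in UTC."""
    open_ny = datetime.combine(day, time(9, 30), tzinfo=NY)
    close_ny = datetime.combine(day, time(16, 0), tzinfo=NY)
    return open_ny.astimezone(timezone.utc), close_ny.astimezone(timezone.utc)


# A summer date (daylight saving time in effect) and a winter date (standard time)
summer_open, summer_close = regular_session_utc(date(2022, 6, 15))
winter_open, winter_close = regular_session_utc(date(2022, 1, 14))

print(summer_open.time(), summer_close.time())
print(winter_open.time(), winter_close.time())
```

Under that assumption, the 14:30 to 21:00 UTC window only applies while New York is on standard time. On 2022-06-15 the window is 13:30 to 20:00 UTC.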

@ywave620

@eiphy

@justinpolygon
Contributor

Hi @ywave620, can you give more detail about what you're looking at here? What is the query you are using so that I can try and reproduce?

@justinpolygon justinpolygon reopened this May 28, 2024
@ywave620

import pickle

import pandas as pd
from polygon import RESTClient

ticker = "O:QQQ220615C00280000"

client = RESTClient(api_key="xx")

# Download one day of minute aggregates and cache them on disk
aggs = []
for a in client.list_aggs(ticker=ticker, multiplier=1, timespan="minute",
                          from_="2022-06-15", to="2022-06-15", limit=50000):
    aggs.append(a)
print(aggs)
with open('option-paid-data/' + ticker, 'wb') as f:
    pickle.dump(aggs, f)


def read_pickle_as_dataframe(file_name):
    # Load the data from the pickle file
    with open('option-paid-data/' + file_name, 'rb') as f:
        data = pickle.load(f)

    # Convert the list of objects to a list of dictionaries
    data_dicts = [vars(obj) for obj in data]

    # Convert the list of dictionaries to a DataFrame
    df = pd.DataFrame(data_dicts)

    # Capitalize the first letter of each column name
    df.columns = [col.capitalize() for col in df.columns]

    # Workaround: shift every timestamp forward by one hour (in milliseconds)
    df['Timestamp'] = df['Timestamp'] + 1000*3600

    # Convert the Unix timestamp to a (naive) datetime object
    df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='ms')

    # Set the 'Timestamp' column as the index
    df.set_index('Timestamp', inplace=True)

    # Keep only rows inside the chosen time window
    start_time = pd.to_datetime('2022-06-15 9:30:00')
    end_time = pd.to_datetime('2022-06-15 21:00:00')

    filtered_df = df[(df.index >= start_time) & (df.index <= end_time)]

    return filtered_df


option_data = read_pickle_as_dataframe(ticker)

@ywave620

Please note that I added "df['Timestamp'] = df['Timestamp'] + 1000*3600" to work around this issue.
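As a possible alternative to adding a fixed offset, here is a minimal pandas sketch that keeps the timestamps timezone-aware and filters on exchange-local time. It assumes the timestamps are Unix epoch milliseconds (UTC by definition) and that the session of interest is 09:30 to 16:00 New York time. The three rows are hypothetical sample data, not real API output.

```python
import pandas as pd

# Hypothetical rows standing in for aggregate bars (millisecond timestamps):
# 13:30 UTC, 19:58 UTC and 22:00 UTC on 2022-06-15
df = pd.DataFrame(
    {
        "Timestamp": [1655299800000, 1655323080000, 1655330400000],
        "Close": [6.23, 7.76, 7.80],
    }
)

# Parse as timezone-aware UTC, then view the index in New York time
df["Timestamp"] = pd.to_datetime(df["Timestamp"], unit="ms", utc=True)
df = df.set_index("Timestamp").tz_convert("America/New_York")

# Keep only bars inside the assumed regular session (both endpoints inclusive)
regular = df.between_time("09:30", "16:00")
print(regular)
```

Because the conversion goes through a named timezone, the same code handles both daylight saving and standard time, whereas a hard-coded one-hour shift is only valid for part of the year and changes the instant each bar refers to.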

3 participants