pyspark.sql.functions.to_unix_timestamp

pyspark.sql.functions.to_unix_timestamp(timestamp, format=None)

Returns the UNIX timestamp (seconds since 1970-01-01 00:00:00 UTC) of the given time.

New in version 3.5.0.

Parameters
timestamp : Column or column name

Input timestamp column or column name.

format : Column or column name, optional

Format used to parse the input values into a UNIX timestamp; defaults to 'yyyy-MM-dd HH:mm:ss' when omitted.

Examples

>>> spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")

Example 1: Using default format to parse the timestamp string.

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([('2015-04-08 12:12:12',)], ['ts'])
>>> df.select('*', sf.to_unix_timestamp('ts')).show()
+-------------------+------------------------------------------+
|                 ts|to_unix_timestamp(ts, yyyy-MM-dd HH:mm:ss)|
+-------------------+------------------------------------------+
|2015-04-08 12:12:12|                                1428520332|
+-------------------+------------------------------------------+
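As a quick sanity check (a sketch, not one of the upstream examples), the result can be converted back to a string with pyspark.sql.functions.from_unixtime, which formats the epoch value in the session time zone set above:

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([(1428520332,)], ['unix'])
>>> df.select('*', sf.from_unixtime('unix')).show()
+----------+----------------------------------------+
|      unix|from_unixtime(unix, yyyy-MM-dd HH:mm:ss)|
+----------+----------------------------------------+
|1428520332|                     2015-04-08 12:12:12|
+----------+----------------------------------------+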

Example 2: Using user-specified format ‘yyyy-MM-dd’ to parse the date string.

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame([('2015-04-08',)], ['dt'])
>>> df.select('*', sf.to_unix_timestamp(df.dt, sf.lit('yyyy-MM-dd'))).show()
+----------+---------------------------------+
|        dt|to_unix_timestamp(dt, yyyy-MM-dd)|
+----------+---------------------------------+
|2015-04-08|                       1428476400|
+----------+---------------------------------+
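As a consistency check with Example 1: 'yyyy-MM-dd' parses the value as midnight local time, so the result is 12:12:12 (43,932 seconds) earlier than Example 1's: 1428520332 - 43932 = 1428476400.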

Example 3: Using a format column so each row is parsed with its own format.

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame(
...     [('2015-04-08', 'yyyy-MM-dd'), ('2025+01+09', 'yyyy+MM+dd')], ['dt', 'fmt'])
>>> df.select('*', sf.to_unix_timestamp('dt', 'fmt')).show()
+----------+----------+--------------------------+
|        dt|       fmt|to_unix_timestamp(dt, fmt)|
+----------+----------+--------------------------+
|2015-04-08|yyyy-MM-dd|                1428476400|
|2025+01+09|yyyy+MM+dd|                1736409600|
+----------+----------+--------------------------+
>>> spark.conf.unset("spark.sql.session.timeZone")
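
Because the return value is a plain number of seconds, it composes with ordinary column arithmetic. A final sketch (illustrative, not from the upstream docs; the 'start' and 'end' column names are made up) computing an interval length in seconds:

>>> import pyspark.sql.functions as sf
>>> df = spark.createDataFrame(
...     [('2015-04-08 12:12:12', '2015-04-08 12:42:12')], ['start', 'end'])
>>> df.select(
...     (sf.to_unix_timestamp('end') - sf.to_unix_timestamp('start')).alias('seconds')
... ).show()
+-------+
|seconds|
+-------+
|   1800|
+-------+

Since both values are parsed in the same session time zone, the difference is independent of which zone is configured (barring a DST transition between the two instants).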