**Description**: The logarithm of a specific field with `base` as the radix. If you do not enter a base, the natural logarithm of the field is returned.
**Return value type**: If the concatenated strings are VARCHARs, the result is a VARCHAR. If the concatenated strings are NCHARs, the result is an NCHAR. If an input value is null, the result is null.
**Return value type**: If the concatenated strings are VARCHARs, the result is a VARCHAR. If the concatenated strings are NCHARs, the result is an NCHAR. If an input value is null, the result is null.
**Description**: The sub-string starting from `pos` with length of `len` from the original string `str` - If `len` is not specified, it means from `pos` to the end.
**Applicable data types**: VARCHAR and NCHAR Parameter `pos` can be an positive or negative integer; If it's positive, the starting position will be counted from the beginning of the string; if it's negative, the starting position will be counted from the end of the string.
1. Some strings cannot be converted to numeric values. For example, the string `a` may be converted to `0`. However, this does not produce an error.
2. If a converted numeric value is larger than the maximum size for the specified type, an overflow will occur. However, this does not produce an error.
3. If a converted string value is larger than the maximum size for the specified type, the output value will be truncated. However, this does not produce an error.
**Description**: The ISO8601 date/time format converted from a UNIX timestamp, plus the timezone. You can specify any time zone with the timezone parameter. If you do not enter this parameter, the time zone on the client is used.
- The precision of the input timestamp will be recognized automatically according to the precision of the table used, milliseconds will be used if no table is specified.
**Applicable data types**: JSON strings in the form `{"literal": literal}`. `{}` indicates a null value. The key must be a string literal. The value can be a numeric literal, string literal, Boolean literal, or null literal. str_literal cannot include escape characters.
- return_timestamp indicates whether the returned value type is TIMESTAMP or not. If this parameter set to 1, function will return TIMESTAMP type. Otherwise function will return BIGINT type. If parameter is omitted, default return value type is BIGINT.
- The output format of `Month`, `Day` are left aligined, like`2023-OCTOBER -01`, `2023-SEPTEMBER-01`, `September` is the longest, no paddings. Week days are slimilar.
- When `ms`,`us`,`ns` are used in `to_char`, like `to_char(ts, 'yyyy-mm-dd hh:mi:ss.ms.us.ns')`, The time of `ms`,`us`,`ns` corresponds to the same fraction seconds. When ts is `1697182085123`, the output of `ms` is `123`, `us` is `123000`, `ns` is `123000000`.
- If we want to output some characters of format without converting, surround it with double quotes. `to_char(ts, 'yyyy-mm-dd "is formated by yyyy-mm-dd"')`. If want to output double quotes, add a back slash before double quote, like `to_char(ts, '\"yyyy-mm-dd\"')` will output `"2023-10-10"`.
- For formats that output digits, the uppercase and lowercase formats are the same.
- The precision of the input timestamp will be recognized automatically according to the precision of the table used, milliseconds will be used if no table is specified.
**Nested query**: It can be used in both the outer query and inner query in a nested query.
**Applicable table types**: standard tables and supertables
**Supported Formats**: The same as `TO_CHAR`.
**More explanations**:
- When `ms`, `us`, `ns` are used in `to_timestamp`, if multi of them are specified, the results are accumulated. For example, `to_timestamp('2023-10-10 10:10:10.123.000456.000000789', 'yyyy-mm-dd hh:mi:ss.ms.us.ns')` will output the timestamp of `2023-10-10 10:10:10.123456789`.
- The uppercase or lowercase of `MONTH`, `MON`, `DAY`, `DY` and formtas that output digits have same effect when used in `to_timestamp`, like `to_timestamp('2023-JANUARY-01', 'YYYY-month-dd')`, `month` can be replaced by `MONTH`, or `month`. The cases are ignored.
- If multi times are specified for one component, the previous will be overwritten. Like `to_timestamp('2023-22-10-10', 'yyyy-yy-MM-dd')`, the output year will be `2022`.
- To avoid unexpected time zone used during the convertion, it's recommended to put time zone in the ts string, e.g. '2023-10-10 10:10:10+08'. If time zone not specified, default will be that in server or client.
- The default timestamp if some components are not specified will be: `1970-01-01 00:00:00` with the timezone specified or default to local timezone. Only `DDD` is specified without `DD` is not supported currently, e.g. format 'yyyy-mm-ddd' is not supported, but 'yyyy-mm-dd' is supported.
- If `AM` or `PM` is specified in formats, the Hour must between `1-12`.
- In some cases, `to_timestamp` can convert correctly even the format and the timestamp string are not totally matched. Like `to_timetamp('200101/2', 'yyyyMM1/dd')`, the digit `1` in format string are ignored, and the output timestsamp is `2001-01-02 00:00:00`. Spaces and tabs in formats and tiemstamp string are also ignored automatically.
- The precision of the output timestamp will be the same as the table in SELECT stmt, millisecond will be used if no table is specified. The output of `select to_timestamp('2023-08-1 10:10:10.123456789', 'yyyy-mm-dd hh:mi:ss.ns')` will be truncated to millisecond precision. If a nano precision table is specified, no truncation will be applied. Like `select to_timestamp('2023-08-1 10:10:10.123456789', 'yyyy-mm-dd hh:mi:ss.ns') from db_ns.table_ns limit 1`.
All functions that return the current time, such as `NOW`, `TODAY`, and `TIMEZONE`, are calculated only once per statement even if they appear multiple times.
- The precision of the input timestamp will be recognized automatically according to the precision of the table used, milliseconds will be used if no table is specified.
- When using 1d/1w as the time unit to truncate timestamp, you can specify whether to truncate based on the current time zone by setting the use_current_timezone parameter.
Value 0 indicates truncation using the UTC time zone, value 1 indicates truncation using the current time zone.
For example, if the time zone configured by the Client is UTC + 0800, TIMETRUNCATE ('2020-01-01 23:00:00', 1d, 0) returns the result of '2020-01-01 08:00:00'.
When using TIMETRUNCATE ('2020-01-01 23:00:00', 1d, 1), the result is 2020-01-01 00:00:00 '.
When use_current_timezone is not specified, use_current_timezone defaults to 1.
-`algo_type` can only be input as `default` or `t-digest` Enter `default` to use a histogram-based algorithm. Enter `t-digest` to use the t-digest algorithm to calculate the approximation of the quantile. `default` is used by default.
- The approximation result of `t-digest` algorithm is sensitive to input data order. For example, when querying STable with different input data order there might be minor differences in calculated results.
**Description**: `elapsed` function can be used to calculate the continuous time length in which there is valid data. If it's used with `INTERVAL` clause, the returned result is the calculated time length within each time window. If it's used without `INTERVAL` clause, the returned result is the calculated time length within the specified time range. Please be noted that the return value of `elapsed` is the number of `time_unit` in the calculated time length.
- The minimum value of `time_unit` is the time precision of the database. If `time_unit` is not specified, the time precision of the database is used as the default time unit. Time unit specified by `time_unit` can be:
- It can be used with `INTERVAL` to get the time valid time length of each time window. Please be noted that the return value is same as the time window for all time windows except for the first and the last time window.
-`order by asc/desc` has no effect on the result.
-`group by tbname` must be used together when `elapsed` is used against a STable.
-`group by` must NOT be used together when `elapsed` is used against a table or sub table.
- When used in nested query, it's only applicable when the inner query outputs an implicit timestamp column as the primary key. For example, `select elapsed(ts) from (select diff(value) from sub1)` is legal usage while `select elapsed(ts) from (select * from sub1)` is not. In addition, because elapsed has a strict dependency on the timeline, a statement like `select elapsed(ts) from (select diff(value) from st group by tbname) will return a meaningless result.
- It can't be used with `leastsquares`, `diff`, `derivative`, `top`, `bottom`, `last_row`, `interp`.
The cardinal number of a specific column is returned by using hyperloglog algorithm. The benefit of using hyperloglog algorithm is that the memory usage is under control when the data volume is huge.
However, when the data volume is very small, the result may be not accurate, it's recommended to use `select count(data) from (select unique(col) as data from table)` in this case.
"start" - bin starting point. "width" - bin offset. "count" - number of bins generated. "infinity" - whether to add (-inf, inf) as start/end point in generated set of bins.
"start" - bin starting point. "factor" - exponential factor of bin offset. "count" - number of bins generated. "infinity" - whether to add (-inf, inf) as start/end point in generated range of bins.
**Description**: The value whose rank in a specific column matches the specified percentage. If such a value matching the specified percentage doesn't exist in the column, an interpolation value will be returned.
**Return value type**: This function takes 2 minimum and 11 maximum parameters, and it can simultaneously return 10 percentiles at most. If 2 parameters are given, a single percentile is returned and the value type is DOUBLE.
- When calculating multiple percentiles of a specific column, a single PERCENTILE function with multiple parameters is advised, as this can largely reduce the query response time.
For example, using SELECT percentile(col, 90, 95, 99) FROM table will perform better than SELECT percentile(col, 90), percentile(col, 95), percentile(col, 99) from table.
Selection functions return one or more results depending. You can specify the timestamp column, tbname pseudocolumn, or tag columns to show which rows contain the selected value.
**Description**: The least _k_ values of a specific column in a table or STable. If a value has multiple occurrences in the column but counting all of them in will exceed the upper limit _k_, then a part of them will be returned randomly.
- FIRST(\*) can be used to get the first non-null value of all columns; When querying a super table and multiResultFunctionStarReturnTags is set to 0 (default), FIRST(\*) only returns columns of super table; When set to 1, returns columns and tags of the super table.
**Description**: The value that matches the specified timestamp range is returned, if existing; or an interpolation value is returned. The value of `ignore_null_values` can be 0 or 1, 1 means null values are ignored. The default value of this parameter is 0.
-`INTERP` is used to get the value that matches the specified time slice from a column. If no such value exists an interpolation value will be returned based on `FILL` parameter.
- The input data of `INTERP` is the value of the specified column and a `where` clause can be used to filter the original data. If no `where` condition is specified then all original data is the input.
- The output time range of `INTERP` is specified by `RANGE(timestamp1,timestamp2)` parameter, with timestamp1 <= timestamp2. timestamp1 is the starting point of the output time range. timestamp2 is the ending point of the output time range.
- The number of rows in the result set of `INTERP` is determined by the parameter `EVERY(time_unit)`. Starting from timestamp1, one interpolation is performed for every time interval specified `time_unit` parameter. The parameter `time_unit` must be an integer, with no quotes, with a time unit of: a(millisecond)), s(second), m(minute), h(hour), d(day), or w(week). For example, `EVERY(500a)` will interpolate every 500 milliseconds.
- When only one timestamp value is specified in `RANGE` clause, `INTERP` is used to generate interpolation at this point in time. In this case, `EVERY` clause can be omitted. For example, SELECT INTERP(col) FROM tb RANGE('2023-01-01 00:00:00') FILL(linear).
-`INTERP` can be applied to supertable by interpolating primary key sorted data of all its childtables. It can also be used with `partition by tbname` when applied to supertable to generate interpolation on each single timeline.
- Pseudocolumn `_irowts` can be used along with `INTERP` to return the timestamps associated with interpolation points(support after version 3.0.2.0).
- Pseudocolumn `_isfilled` can be used along with `INTERP` to indicate whether the results are original records or data points generated by interpolation algorithm(support after version 3.0.3.0).
- We use the smart meters example used in this documentation to illustrate how to use the INTERP function.
- We want to downsample every 1 hour and use a linear fill for missing values. Note the order in which the "partition by" clause and the "range", "every" and "fill" parameters are used.
- LAST(\*) can be used to get the last non-NULL value of all columns; When querying a super table and multiResultFunctionStarReturnTags is set to 0 (default), LAST(\*) only returns columns of super table; When set to 1, returns columns and tags of the super table.
- If the values of a column in the result set are all NULL, NULL is returned for that column; if all columns in the result are all NULL, no result will be returned.
- When it's used on a STable, if there are multiple values with the timestamp in the result set, one of them will be returned randomly and it's not guaranteed that the same value is returned if the same query is run multiple times.
- LAST_ROW(\*) can be used to get the last value of all columns; When querying a super table and multiResultFunctionStarReturnTags is set to 0 (default), LAST_ROW(\*) only returns columns of super table; When set to 1, returns columns and tags of the super table.
- When it's used on a STable, if there are multiple values with the timestamp in the result set, one of them will be returned randomly and it's not guaranteed that the same value is returned if the same query is run multiple times.
**Description**:The value which has the highest frequency of occurrence. One random value is returned if there are multiple values which have highest frequency of occurrence.
**Description**: The next _k_ rows are returned after skipping the last `offset_val` rows, NULL values are not ignored. `offset_val` is optional parameter. When it's not specified, the last _k_ rows are returned. When `offset_val` is used, the effect is same as `order by ts desc LIMIT k OFFSET offset_val`.
**Description**: The greatest _k_ values of a specific column in a table or STable. If a value has multiple occurrences in the column but counting all of them in will exceed the upper limit _k_, then a part of them will be returned randomly.
**Description**: The values that occur the first time in the specified column. The effect is similar to `distinct` keyword. For a table with composite primary key, only the data with the smallest primary key value is returned.
TDengine includes extensions to standard SQL that are intended specifically for time-series use cases. The functions enabled by these extensions require complex queries to implement in general-purpose databases. By offering them as built-in extensions, TDengine reduces user workload.
**Description**: The derivative of a specific column. The time rage can be specified by parameter `time_interval`, the minimum allowed time range is 1 second (1s); the value of `ignore_negative` can be 0 or 1, 1 means negative values are ignored. For tables with composite primary key, the data with the smallest primary key value is used to calculate the derivative.
**Description**: The difference of each row with its previous row for a specific column. `ignore_option` takes the value of 0|1|2|3, the default value is 0 if it's not specified.
**Return value type**: `bool`, `timestamp` and `integer` value type all return `int_64`, `float` type returns `double`; if the diff result overflows, it is returned as overflow.
- diff is to calculate the difference of a specific column in current row and the **first valid data before the row**. The **first valid data before the row** refers to the most adjacent non-null value of same column with smaller timestamp.
- The diff result of numeric type is the corresponding arithmatic difference; the timestamp is calculated based on the timestamp precision of the database; when calculating diff, `true` is treated as 1 and `false` is treated as 0
- If the data of current row is NULL or can't find the **first valid data before the current row**, the diff result is NULL
- When ignoring negative values (ignore_option is set to 1 or 3), if the diff result is negative, the result is set to null, and then filtered according to the null value filtering rule
- When the diff result has an overflow, whether to ignore the negative value depends on the result of the logical operation is positive or negative. For example, the value of 9223372036854775800 - (-9223372036854775806) exceeds the range of BIGINT, and the diff result will display the overflow value -10, but it will not be ignored as a negative value
- Single or multiple diffs can be used in a single statement, and for each diff you can specify same or different `ignore_option`. When there are multiple diffs in a single statement, when and only when all the diff results are NULL for a row and each diff's `ignore_option` is specified as ignoring NULL, the output of this row will be removed from the result set.
- Can be used with the selected associated columns. For example: `select _rowts, DIFF()`.
- When there is not composite primary key, if there are the same timestamps across different subtables, it will prompt "Duplicate timestamps not allowed"
- When using with composite primary key, there may be same combination of timestamp and complete primary key across sub-tables, which row will be used depends on which row is found first, that means the result of running diff() multiple times may be different in such a case
**Description**: instantaneous rate on a specific column. The last two samples in the specified time range are used to calculate instantaneous rate. If the last sample value is smaller, then only the last sample value is used instead of the difference between the last two sample values. For tables with composite primary key, the data with the smallest primary key value is used to calculate the rate.
**Description**: The moving average of continuous _k_ values of a specific column. If the number of input rows is less than _k_, nothing is returned. The applicable range of _k_ is [1,1000].
**Description**: The number of continuous rows satisfying the specified conditions for a specific column. The result is shown as an extra column for each row. If the specified condition is evaluated as true, the number is increased by 1; otherwise the number is reset to -1. If the input value is NULL, then the corresponding row is skipped.
- oper : Can be one of `'LT'` (lower than), `'GT'` (greater than), `'LE'` (lower than or equal to), `'GE'` (greater than or equal to), `'NE'` (not equal to), `'EQ'` (equal to), the value is case insensitive, the value must be in quotes.
**Description**: The length of time range in which all rows satisfy the specified condition for a specific column. The result is shown as an extra column for each row. The length for the first row that satisfies the condition is 0. Next, if the condition is evaluated as true for a row, the time interval between current row and its previous row is added up to the time range; otherwise the time range length is reset to -1. If the value of the column is NULL, the corresponding row is skipped.
- oper : Can be one of `'LT'` (lower than), `'GT'` (greater than), `'LE'` (lower than or equal to), `'GE'` (greater than or equal to), `'NE'` (not equal to), `'EQ'` (equal to), the value is case insensitive, the value must be in quotes.
- unit: The unit of time interval. Enter one of the following options: 1b (nanoseconds), 1u (microseconds), 1a (milliseconds), 1s (seconds), 1m (minutes), 1h (hours), 1d (days), or 1w (weeks) If you do not enter a unit of time, the precision of the current database is used by default.
**Description**: Time weighted average on a specific column within a time range. For tables with composite primary key, the data with the smallest primary key value is used to calculate the average.
**Description**: The current database. If no database is specified upon logging in and no database is specified with `USE` after login, NULL will be returned by `select database()`.
Geometry input functions create geometry data from WTK.
#### ST_GeomFromText
```sql
ST_GeomFromText(VARCHAR WKT expr)
```
**Description**: Return a specified GEOMETRY value from Well-Known Text representation (WKT).
**Return value type**: GEOMETRY
**Applicable data types**: VARCHAR
**Applicable table types**: standard tables and supertables
**Explanations**:
- The input can be one of WTK string, like POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION.
- The output is a GEOMETRY data type, internal defined as binary string.
### Geometry Output Functions
Geometry output functions convert geometry data into WTK.
#### ST_AsText
```sql
ST_AsText(GEOMETRY geom)
```
**Description**: Return a specified Well-Known Text representation (WKT) value from GEOMETRY data.
**Return value type**: VARCHAR
**Applicable data types**: GEOMETRY
**Applicable table types**: standard tables and supertables
**Explanations**:
- The output can be one of WTK string, like POINT, LINESTRING, POLYGON, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION.
### Geometry Relationships Functions
Geometry relationships functions determine spatial relationships between geometries.
#### ST_Intersects
```sql
ST_Intersects(GEOMETRY geomA, GEOMETRY geomB)
```
**Description**: Compares two geometries and returns true if they intersect.
**Return value type**: BOOL
**Applicable data types**: GEOMETRY, GEOMETRY
**Applicable table types**: standard tables and supertables
**Explanations**:
- Geometries intersect if they have any point in common.
#### ST_Equals
```sql
ST_Equals(GEOMETRY geomA, GEOMETRY geomB)
```
**Description**: Returns TRUE if the given geometries are "spatially equal".
**Return value type**: BOOL
**Applicable data types**: GEOMETRY, GEOMETRY
**Applicable table types**: standard tables and supertables
**Explanations**:
- 'Spatially equal' means ST_Contains(A,B) = true and ST_Contains(B,A) = true, and the ordering of points can be different but represent the same geometry structure.
#### ST_Touches
```sql
ST_Touches(GEOMETRY geomA, GEOMETRY geomB)
```
**Description**: Returns TRUE if A and B intersect, but their interiors do not intersect.
**Return value type**: BOOL
**Applicable data types**: GEOMETRY, GEOMETRY
**Applicable table types**: standard tables and supertables
**Explanations**:
- A and B have at least one point in common, and the common points lie in at least one boundary.
- For Point/Point inputs the relationship is always FALSE, since points do not have a boundary.
#### ST_Covers
```sql
ST_Covers(GEOMETRY geomA, GEOMETRY geomB)
```
**Description**: Returns TRUE if every point in Geometry B lies inside (intersects the interior or boundary of) Geometry A.
**Return value type**: BOOL
**Applicable data types**: GEOMETRY, GEOMETRY
**Applicable table types**: standard tables and supertables
**Explanations**:
- A covers B means no point of B lies outside (in the exterior of) A.
#### ST_Contains
```sql
ST_Contains(GEOMETRY geomA, GEOMETRY geomB)
```
**Description**: Returns TRUE if geometry A contains geometry B.
**Return value type**: BOOL
**Applicable data types**: GEOMETRY, GEOMETRY
**Applicable table types**: standard tables and supertables
**Explanations**:
- A contains B if and only if all points of B lie inside (i.e. in the interior or boundary of) A (or equivalently, no points of B lie in the exterior of A), and the interiors of A and B have at least one point in common.