Mapping types
Elasticsearch and ClickHouse support a wide variety of data types, but their underlying storage and query models are fundamentally different. This section maps commonly used Elasticsearch field types to their ClickHouse equivalents, where available, and provides context to help guide migrations. Where no equivalent exists, alternatives or notes are provided in the comments.
Elasticsearch Type | ClickHouse Equivalent | Comments |
---|---|---|
boolean | UInt8 or Bool | Newer ClickHouse versions provide a Bool type, which is stored internally as UInt8. |
keyword | String | Used for exact-match filtering, grouping, and sorting. |
text | String | Full-text search is limited in ClickHouse; tokenization requires custom logic using functions such as tokens combined with array functions. |
long | Int64 | 64-bit signed integer. |
integer | Int32 | 32-bit signed integer. |
short | Int16 | 16-bit signed integer. |
byte | Int8 | 8-bit signed integer. |
unsigned_long | UInt64 | Unsigned 64-bit integer. |
double | Float64 | 64-bit floating-point. |
float | Float32 | 32-bit floating-point. |
half_float | Float32 or BFloat16 | No exact equivalent: ClickHouse has no IEEE 754 half-precision float. BFloat16 is also 16 bits wide but is a different format: half-float offers higher precision with a smaller range, while BFloat16 sacrifices precision for a wider range, making it better suited for machine learning workloads. Float32 represents every half_float value exactly. |
scaled_float | Decimal(x, y) | Store fixed-point numeric values. |
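date | DateTime or DateTime64(3) | DateTime has second precision. Elasticsearch date values carry millisecond precision, so use DateTime64(3) where sub-second precision must be kept. |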
date_nanos | DateTime64 | ClickHouse supports nanosecond precision with DateTime64(9). |
binary | String, FixedString(N) | Elasticsearch accepts binary values as Base64-encoded strings; decode them (e.g., with base64Decode) to store raw bytes. |
ip | IPv4, IPv6 | Native IPv4 and IPv6 types available. |
object | Nested, Map, Tuple, JSON | ClickHouse can model JSON-like objects using Nested or JSON. |
flattened | String | The flattened type in Elasticsearch stores entire JSON objects as single fields, enabling flexible, schemaless access to nested keys without full mapping. In ClickHouse, similar functionality can be achieved using the String type, but requires processing to be done in materialized views. |
nested | Nested | ClickHouse Nested columns provide similar semantics for grouped subfields, provided flatten_nested=0 is set. |
join | NA | No direct concept of parent-child relationships. Not required in ClickHouse as joins across tables are supported. |
alias | ALIAS column modifier | Aliases are supported through a column modifier. Functions can be applied to an alias, e.g., size String ALIAS formatReadableSize(size_bytes). |
range types (*_range) | Tuple(start, end) or Array(T) | ClickHouse has no native range type, but numerical and date ranges can be represented using Tuple(start, end) or Array structures. For IP ranges (ip_range), store CIDR values as String and evaluate with functions like isIPAddressInRange(). Alternatively, consider ip_trie based lookup dictionaries for efficient filtering. |
aggregate_metric_double | AggregateFunction(...) and SimpleAggregateFunction(...) | Use aggregate function states and materialized views to model pre-aggregated metrics. All aggregation functions support aggregate states. |
histogram | Tuple(Array(Float64), Array(UInt64)) | Manually represent buckets and counts using arrays or custom schemas. |
annotated-text | String | No built-in support for entity-aware search or annotations. |
completion, search_as_you_type | NA | No native autocomplete or suggester engine. Can be reproduced with String and search functions. |
semantic_text | NA | No native semantic search - generate embeddings and use vector search. |
token_count | Int32 | Compute the token count manually during ingestion, e.g., with length(tokens(col)) in a MATERIALIZED column. |
dense_vector | Array(Float32) | Use arrays for embedding storage. |
sparse_vector | Map(UInt32, Float32) | Simulate sparse vectors with maps. No native sparse vector support. |
rank_feature / rank_features | Float32, Array(Float32) | No native query-time boosting, but can be modeled manually in scoring logic. |
geo_point | Tuple(Float64, Float64) or Point | Use tuple of (latitude, longitude). Point is available as a ClickHouse type. |
geo_shape, shape | Ring, LineString, MultiLineString, Polygon, MultiPolygon | Native support for geo shapes and spatial indexing. |
percolator | NA | No concept of indexing queries. Use standard SQL + Incremental Materialized Views instead. |
version | String | ClickHouse does not have a native version type. Store versions as strings and use user-defined functions (UDFs) to perform semantic comparisons if needed. Consider normalizing to numeric formats if range queries are required. |
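
To make a few of the mappings above concrete, here is a minimal sketch of a ClickHouse table covering several common Elasticsearch field types. The table name `logs_example` and all column names are hypothetical and chosen only for illustration; the `size` alias follows the example given in the alias row. `DateTime64(3)` is used on the assumption that the source `date` field carries millisecond precision that should be kept.

```sql
-- Hypothetical table; names are illustrative, not taken from an existing schema.
CREATE TABLE logs_example
(
    timestamp      DateTime64(3),   -- Elasticsearch date (millisecond precision assumed)
    service        String,          -- keyword
    message        String,          -- text
    is_error       Bool,            -- boolean
    status_code    Int32,           -- integer
    size_bytes     UInt64,          -- unsigned_long
    client_ip      IPv4,            -- ip
    embedding      Array(Float32),  -- dense_vector
    -- alias: evaluated at query time, not stored on disk
    size           String ALIAS formatReadableSize(size_bytes),
    -- token_count: computed at insert time and stored
    message_tokens Int32 MATERIALIZED length(tokens(message))
)
ENGINE = MergeTree
ORDER BY (service, timestamp);

-- ALIAS and MATERIALIZED columns are not supplied on insert.
INSERT INTO logs_example
    (timestamp, service, message, is_error, status_code, size_bytes, client_ip, embedding)
VALUES
    ('2024-01-01 12:00:00.123', 'api', 'connection timed out', true, 504, 2048, '10.0.0.7', [0.1, 0.2, 0.3]);

-- ip_range-style filtering against a CIDR block.
-- Assumes a ClickHouse version in which isIPAddressInRange accepts an IPv4
-- column directly; the function also accepts the address as a String.
SELECT service, size, message_tokens
FROM logs_example
WHERE isIPAddressInRange(client_ip, '10.0.0.0/8');
```

Note that `size` and `message_tokens` are selected by name here: by default, ALIAS and MATERIALIZED columns are not returned by `SELECT *`.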
Notes
- Arrays: In Elasticsearch, all fields support arrays natively. In ClickHouse, arrays must be explicitly defined (e.g., `Array(String)`), with the advantage that specific positions can be accessed and queried, e.g., `an_array[1]`.
- Multi-fields: Elasticsearch allows indexing the same field multiple ways (e.g., both `text` and `keyword`). In ClickHouse, this pattern must be modeled using separate columns or views.
- Map and JSON types: In ClickHouse, the `Map` type is commonly used to model dynamic key-value structures such as `resourceAttributes` and `logAttributes`. This type enables flexible schema-less ingestion by allowing arbitrary keys to be added at runtime, similar in spirit to JSON objects in Elasticsearch. However, there are important limitations to consider:
  - Uniform value types: ClickHouse `Map` columns must have a consistent value type (e.g., `Map(String, String)`). Mixed-type values are not supported without coercion.
  - Performance cost: accessing any key in a `Map` requires loading the entire map into memory, which can be suboptimal for performance.
  - No subcolumns: unlike JSON, keys in a `Map` are not represented as true subcolumns, which limits ClickHouse's ability to index, compress, and query efficiently.

  Because of these limitations, ClickStack is migrating away from `Map` in favor of ClickHouse's enhanced `JSON` type. The `JSON` type addresses many of the shortcomings of `Map`:
  - True columnar storage: each JSON path is stored as a subcolumn, allowing efficient compression, filtering, and vectorized query execution.
  - Mixed-type support: different data types (e.g., integers, strings, arrays) can coexist under the same path without coercion or type unification.
  - File system scalability: internal limits on dynamic keys (`max_dynamic_paths`) and types (`max_dynamic_types`) prevent an explosion of column files on disk, even with high cardinality key sets.
  - Dense storage: nulls and missing values are stored sparsely to avoid unnecessary overhead.

  The `JSON` type is especially well-suited for observability workloads, offering the flexibility of schemaless ingestion with the performance and scalability of native ClickHouse types, making it an ideal replacement for `Map` in dynamic attribute fields.

  For further details on the JSON type we recommend the JSON guide and "How we built a new powerful JSON data type for ClickHouse".
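
The notes on arrays, `Map`, and `JSON` can be sketched side by side as follows. This is an illustration under stated assumptions rather than a recommended schema: the table names `attrs_map` and `attrs_json`, the `tags` column, and the value `256` for `max_dynamic_paths` are all hypothetical. Depending on the ClickHouse version, the `JSON` type may first need to be enabled through a setting; check the documentation for the version in use.

```sql
-- Map: every value must share one type (String here).
CREATE TABLE attrs_map
(
    id            UInt64,
    tags          Array(String),        -- arrays are declared explicitly
    logAttributes Map(String, String)
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO attrs_map VALUES
    (1, ['a', 'b'], {'http.status': '200', 'region': 'eu'});

-- Array positions are 1-based; a Map key is looked up with [].
SELECT tags[1] AS first_tag,
       logAttributes['http.status'] AS status
FROM attrs_map;

-- JSON: each path becomes a subcolumn and values keep their own types.
-- The limit of 256 dynamic paths is an arbitrary illustrative value.
CREATE TABLE attrs_json
(
    id            UInt64,
    logAttributes JSON(max_dynamic_paths = 256)
)
ENGINE = MergeTree
ORDER BY id;

INSERT INTO attrs_json VALUES
    (1, '{"http": {"status": 200}, "region": "eu"}');

-- Paths are read as subcolumns using dot notation.
SELECT logAttributes.http.status AS status,
       logAttributes.region      AS region
FROM attrs_json;
```

In the `Map` version the status code has to be stored as the string `'200'` to satisfy the uniform value type, while the `JSON` version keeps it as a number alongside the string-valued `region`.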