F. Abbreviation

Codeforces
IDCF1003F
Time1000ms
Memory256MB
Difficulty
dphashingstrings
English · Original
Chinese · Translation
Formal · Original
You are given a text consisting of $n$ space-separated words. There is exactly one space character between any pair of adjacent words. There are no spaces before the first word and no spaces after the last word. The length of text is the number of letters and spaces in it. $w_i$ is the $i$\-th word of text. All words consist only of lowercase Latin letters. Let's denote a segment of words $w[i..j]$ as a sequence of words $w_i, w_{i + 1}, \dots, w_j$. Two segments of words $w[i_1 .. j_1]$ and $w[i_2 .. j_2]$ are considered **equal** if $j_1 - i_1 = j_2 - i_2$, $j_1 \ge i_1$, $j_2 \ge i_2$, and for every $t \in [0, j_1 - i_1]$ $w_{i_1 + t} = w_{i_2 + t}$. For example, for the text "_to be or not to be_" the segments $w[1..2]$ and $w[5..6]$ are equal, they correspond to the words "_to be_". An abbreviation is a replacement of some segments of words with their first **uppercase** letters. In order to perform an abbreviation, you have to choose **at least two** non-intersecting equal segments of words, and replace each chosen segment with the string consisting of first letters of the words in the segment (written in uppercase). For example, for the text "_a ab a a b ab a a b c_" you can replace segments of words $w[2..4]$ and $w[6..8]$ with an abbreviation "_AAA_" and obtain the text "_a AAA b AAA b c_", or you can replace segments of words $w[2..5]$ and $w[6..9]$ with an abbreviation "_AAAB_" and obtain the text "_a AAAB AAAB c_". What is the minimum length of the text after at most one abbreviation? ## Input The first line of the input contains one integer $n$ ($1 \le n \le 300$) — the number of words in the text. The next line contains $n$ space-separated words of the text $w_1, w_2, \dots, w_n$. Each word consists only of lowercase Latin letters. It is guaranteed that the length of text does not exceed $10^5$. ## Output Print one integer — the minimum length of the text after at most one abbreviation. [samples] ## Note In the first example you can obtain the text "_TB or not TB_". In the second example you can obtain the text "_a AAAB AAAB c_". In the third example you can obtain the text "_AB aa AB bb_".
你被给定一个由 $n$ 个空格分隔的单词组成的文本。任意两个相邻单词之间恰好有一个空格字符。第一个单词前没有空格,最后一个单词后也没有空格。文本的长度是其中字母和空格的总数。$w_i$ 表示文本的第 $i$ 个单词。所有单词仅由小写拉丁字母组成。 我们用 $w [ i.. j ]$ 表示由单词 $w_i, w_{i + 1}, \dots,h, w_j$ 组成的单词段。两个单词段 $w [ i_1.. j_1 ]$ 和 $w [ i_2.. j_2 ]$ 被认为是 *相等* 的,当且仅当 $j_1 - i_1 = j_2 - i_2$,$j_1 \geq i_1$,$j_2 \geq i_2$,且对于每个 $t \in [ 0, j_1 - i_1 ]$,都有 $w_{i_1 + t} = w_{i_2 + t}$。例如,对于文本 "_to be or not to be_",段 $w [ 1.. 2 ]$ 和 $w [ 5.. 6 ]$ 是相等的,它们都对应单词 "_to be_"。 缩写是指将某些单词段替换为其每个单词的首字母(大写形式)。要执行缩写,你必须选择 *至少两个* 不相交的相等单词段,并将每个选中的段替换为由该段中单词首字母(大写)组成的字符串。例如,对于文本 "_a ab a a b ab a a b c_",你可以将单词段 $w [ 2.. 4 ]$ 和 $w [ 6.. 8 ]$ 替换为缩写 "_AAA_",得到文本 "_a AAA b AAA b c_";或者将单词段 $w [ 2.. 5 ]$ 和 $w [ 6.. 9 ]$ 替换为缩写 "_AAAB_",得到文本 "_a AAAB AAAB c_"。 求在至多进行一次缩写后,文本的最小长度是多少? 输入的第一行包含一个整数 $n$ ($1 \leq n \leq 300$) —— 文本中单词的数量。 下一行包含 $n$ 个空格分隔的文本单词 $w_1, w_2, \dots,h, w_n$。每个单词仅由小写拉丁字母组成。 保证文本长度不超过 $10^5$。 输出一个整数 —— 至多进行一次缩写后文本的最小长度。 在第一个例子中,你可以得到文本 "_TB or not TB_"。 在第二个例子中,你可以得到文本 "_a AAAB AAAB c_"。 在第三个例子中,你可以得到文本 "_AB aa AB bb_"。 ## Input 输入的第一行包含一个整数 $n$ ($1 \leq n \leq 300$) —— 文本中单词的数量。下一行包含 $n$ 个空格分隔的文本单词 $w_1, w_2, \dots,h, w_n$。每个单词仅由小写拉丁字母组成。保证文本长度不超过 $10^5$。 ## Output 输出一个整数 —— 至多进行一次缩写后文本的最小长度。 [samples] ## Note 在第一个例子中,你可以得到文本 "_TB or not TB_"。在第二个例子中,你可以得到文本 "_a AAAB AAAB c_"。在第三个例子中,你可以得到文本 "_AB aa AB bb_"。
**Definitions** Let $ n \in \mathbb{Z} $ be the number of words. Let $ W = (w_1, w_2, \dots, w_n) $ be the sequence of words, where each $ w_i \in \Sigma^* $, $ \Sigma = \{a, b, \dots, z\} $. Let $ L(w_i) $ denote the length of word $ w_i $. The original text length is $ \sum_{i=1}^n L(w_i) + (n - 1) $ (sum of word lengths plus $ n-1 $ spaces). A *segment* $ W[i..j] $ is the subsequence $ (w_i, w_{i+1}, \dots, w_j) $, with $ 1 \leq i \leq j \leq n $. Two segments $ W[i_1..j_1] $ and $ W[i_2..j_2] $ are *equal* if: - $ j_1 - i_1 = j_2 - i_2 $ (same length), - $ j_1 \geq i_1 $, $ j_2 \geq i_2 $, - $ \forall t \in [0, j_1 - i_1],\ w_{i_1 + t} = w_{i_2 + t} $. An *abbreviation* is defined by selecting a set $ S $ of at least two non-intersecting equal segments $ W[i_k..j_k] $, each of length $ \ell = j_k - i_k + 1 $. Each such segment is replaced by the string $ A = \bigcup_{t=0}^{\ell-1} \{ \text{upper}(w_{i_k + t}[0]) \} $, i.e., the concatenation of the first uppercase letters of the words in the segment. The length of the abbreviation string is $ \ell $. **Constraints** 1. $ 1 \leq n \leq 300 $ 2. Each word consists of lowercase Latin letters. 3. Total text length $ \leq 10^5 $. 4. At most one abbreviation may be applied. 5. The abbreviation must replace at least two non-intersecting equal segments. **Objective** Find the minimum possible length of the text after applying at most one valid abbreviation. Let $ \text{orig} = \sum_{i=1}^n L(w_i) + (n - 1) $. For any pair of non-intersecting equal segments $ W[i..j] $ and $ W[i'..j'] $ with $ \ell = j - i + 1 \geq 2 $, let: - $ \text{save} = 2 \cdot \left( \sum_{k=i}^j L(w_k) + (\ell - 1) \right) - 2 \cdot \ell $ (original length of two segments minus the length of two abbreviations). Since each segment of $ \ell $ words has $ \sum_{k=i}^j L(w_k) + (\ell - 1) $ characters, and is replaced by a string of length $ \ell $, the total reduction from replacing two such segments is: $$ \Delta = 2 \cdot \left( \sum_{k=i}^j L(w_k) + (\ell - 1) \right) - 2 \cdot \ell = 2 \cdot \sum_{k=i}^j L(w_k) - 2 $$ (since $ 2(\ell - 1) - 2\ell = -2 $). Then, the resulting text length is $ \text{orig} - \Delta $. We seek: $$ \min \left( \text{orig},\ \min_{\substack{ \text{non-intersecting} \\ \text{equal segments } W[i..j], W[i'..j'] \\ \ell = j-i+1 \geq 2 }} \left( \text{orig} - \left(2 \cdot \sum_{k=i}^j L(w_k) - 2 \right) \right) \right) $$ Equivalently: $$ \boxed{ \min \left( \text{orig},\ \text{orig} - \max_{\substack{ \text{non-intersecting} \\ \text{equal segments } W[i..j], W[i'..j'] \\ \ell \geq 2 }} \left(2 \cdot \sum_{k=i}^j L(w_k) - 2 \right) \right) } $$
Samples
Input #1
6
to be or not to be
Output #1
12
Input #2
10
a ab a a b ab a a b c
Output #2
13
Input #3
6
aa bb aa aa bb bb
Output #3
11
API Response (JSON)
{
  "problem": {
    "name": "F. Abbreviation",
    "description": {
      "content": "You are given a text consisting of $n$ space-separated words. There is exactly one space character between any pair of adjacent words. There are no spaces before the first word and no spaces after the",
      "description_type": "Markdown"
    },
    "platform": "Codeforces",
    "limit": {
      "time_limit": 1000,
      "memory_limit": 262144
    },
    "difficulty": "None",
    "is_remote": true,
    "is_sync": true,
    "sync_url": null,
    "sign": "CF1003F"
  },
  "statements": [
    {
      "statement_type": "Markdown",
      "content": "You are given a text consisting of $n$ space-separated words. There is exactly one space character between any pair of adjacent words. There are no spaces before the first word and no spaces after the...",
      "is_translate": false,
      "language": "English"
    },
    {
      "statement_type": "Markdown",
      "content": "你被给定一个由 $n$ 个空格分隔的单词组成的文本。任意两个相邻单词之间恰好有一个空格字符。第一个单词前没有空格,最后一个单词后也没有空格。文本的长度是其中字母和空格的总数。$w_i$ 表示文本的第 $i$ 个单词。所有单词仅由小写拉丁字母组成。\n\n我们用 $w [ i.. j ]$ 表示由单词 $w_i, w_{i + 1}, \\dots,h, w_j$ 组成的单词段。两个单词段 $w [ i_...",
      "is_translate": true,
      "language": "Chinese"
    },
    {
      "statement_type": "Markdown",
      "content": "**Definitions**  \nLet $ n \\in \\mathbb{Z} $ be the number of words.  \nLet $ W = (w_1, w_2, \\dots, w_n) $ be the sequence of words, where each $ w_i \\in \\Sigma^* $, $ \\Sigma = \\{a, b, \\dots, z\\} $.  \nLe...",
      "is_translate": false,
      "language": "Formal"
    }
  ]
}
Full JSON Raw Segments