Hi Peter,

I will try to explain this at an intuitive, although not rigorous, level.

If $A$ and $B$ are sets, $A^B$ represents the set of functions from $B$ to $A$. The notation is justified by the fact that, if both sets are finite, with $|A|=m$ and $|B|=n$, then $A^B$ is a set of $m^n$ elements.
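For small finite sets, this count can be checked by brute force. The Python sketch below is only an illustration: the sets `A` and `B` and the helper `all_functions` are my own choices, not part of your question. It encodes each function from $B$ to $A$ as a dictionary and compares the number of such functions with $m^n$.

```python
from itertools import product

def all_functions(domain, codomain):
    """Return every function domain -> codomain, each encoded as a dict."""
    domain = sorted(domain)
    codomain = sorted(codomain)
    # One function = one choice of a value for each element of the domain.
    return [dict(zip(domain, values))
            for values in product(codomain, repeat=len(domain))]

A = {"a", "b", "c"}   # m = 3 (illustrative)
B = {1, 2}            # n = 2 (illustrative)

functions = all_functions(B, A)
print(len(functions), len(A) ** len(B))  # both numbers should agree
```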

When we are trying to define the Cartesian product of a family of sets, we must take into account the fact that some of these sets may be identical. For example, if we look at $A\times A$, we cannot simply consider the set $\{A,A\}$, because this is just $\{A\}$: in a set, each element occurs only once. We must somehow be able to distinguish the first $A$ from the second $A$.

To do this, instead of considering simply a set of sets, we consider a family of sets. We must assume that each set in the family is uniquely identified by some label or index; the set of indexes, denoted by $I$ in this case, is called the indexing set. We write $X_i$ for the set corresponding to the index $i\in I$, and we denote the family by $(X_i)_{i\in I}$. Note that:

- The same set may occur more than once in the family: we may have $X_i = X_j$ for $i\ne j$.
- In many examples, you would have $I=\{1, 2, \dots\}$, but $I$ can be any set, even an uncountable set like $\mathbb{R}$. Here, to illustrate the concepts, I will only use the first case, but the possibility of the second case is one of the reasons for this elaborate definition.
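The difference between a set of sets and an indexed family can be seen in a short Python sketch. The names `A`, `collection` and `family` are purely illustrative, and a dictionary is just one possible way to model a family: its keys play the role of the indexing set $I$.

```python
A = frozenset({0, 1})

# A set of sets: the repeated A collapses into a single element.
collection = {A, A}
print(len(collection))  # 1

# An indexed family: the indices 1 and 2 keep the two copies apart,
# even though X_1 = X_2.
family = {1: A, 2: A}
print(len(family))      # 2
```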

We want elements of the Cartesian product to be the analog of ordered tuples. This means that, for each element of the Cartesian product, we must take one element for each set in the family. Now, a set $X_i$ of the family is identified by its index $i$, and we will denote the corresponding element by $x_i$, with $x_i\in X_i$.

This means that, for each index $i$, we choose exactly one element $x_i$. This is precisely what a function $f$ defined on the set $I$ does, with $f(i)=x_i$.

The co-domain of that function is the set of all possible elements to choose from, and this is $\bigcup_{i\in I}X_i$. This shows that we have a function $f:I\to\bigcup_{i\in I}X_i$, and the set of all functions from $I$ to this union is $\left(\bigcup_{i\in I}X_i\right)^I$, as explained at the beginning of this answer.

If, for example, $I = \{1, 2, \dots\}$, an element of the Cartesian product would be a function $f$ defined on $I$ such that $f(i) = x_i\in X_i$, and we can write that simply as a sequence $(x_1, x_2,\dots)$.

Note that, although the co-domain of $f$ is the whole set $\bigcup_{i\in I}X_i$ (the set of all possible choices for the $x_i$), not every function from $I$ to this set belongs to the Cartesian product. Indeed, we can only take $x_i$ in the corresponding set $X_i$, and this explains the definition $\left\{ x \in \left(\bigcup_{i\in I} X_i\right)^I \mid \forall j \in I,\ x_j \in X_j \right\}$, where $x_j$ denotes the value of the function $x$ at the index $j$.
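For a finite family, this definition can be followed literally. The sketch below is an illustration under my own assumptions: the helper `cartesian_product` and the example family are hypothetical, and the family is again modelled as a dictionary from indices to sets. It enumerates every function from $I$ to the union and keeps only those satisfying $x_j \in X_j$ for all $j$.

```python
from itertools import product

def cartesian_product(family):
    """Return all functions x on I with x[j] in family[j] for every j.

    `family` maps each index j in I to the set X_j; each function x
    is encoded as a dict from indices to chosen elements.
    """
    indices = sorted(family)
    union = sorted(set().union(*family.values()))
    # Every function I -> union, i.e. every element of (union)^I.
    candidates = (dict(zip(indices, values))
                  for values in product(union, repeat=len(indices)))
    # Keep only the functions that pick x_j inside X_j.
    return [x for x in candidates
            if all(x[j] in family[j] for j in indices)]

family = {1: {0, 1}, 2: {1, 2, 3}}   # illustrative X_1 and X_2
for x in cartesian_product(family):
    print((x[1], x[2]))              # each function written as a pair (x_1, x_2)
```

Here the union has 4 elements, so there are $4^2 = 16$ functions from $I$ to the union, but only $2 \times 3 = 6$ of them satisfy the condition. Filtering all functions is of course wasteful in practice; it is done this way only to mirror the definition.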

Does this begin to clarify things? Feel free to write back if you need further help.